Abstract



The XOR-satisfiability (XORSAT) problem requires finding an assignment of n Boolean 
variables that satisfies m exclusive OR (XOR) clauses, whereby each clause constrains a subset 
of the variables. We consider random XORSAT instances, drawn uniformly at random from 
the ensemble of formulae containing n variables and m clauses of size k. This model presents
several structural similarities to other ensembles of constraint satisfaction problems, such as
k-satisfiability (k-SAT). For many of these ensembles, as the number of constraints per variable
grows, the set of solutions shatters into an exponential number of well-separated components. 
This phenomenon appears to be related to the difficulty of solving random instances of such 
problems. 
We prove a complete characterization of this clustering phase transition for random k-XORSAT.
In particular we prove that the clustering threshold is sharp and determine its exact
location. We prove that the set of solutions has large conductance below this threshold and that
each of the clusters has large conductance above the same threshold.

Our proof constructs a very sparse basis for the set of solutions (or the subset within a
cluster). This construction is achieved through a low complexity iterative algorithm.
1 Introduction

An instance of XOR-satisfiability (XORSAT) is specified by an integer $n$ (the number of variables)
and by a set of $m$ clauses of the form $x_{i_a(1)} \oplus \cdots \oplus x_{i_a(k)} = b_a$ for
$a \in [m] \equiv \{1, \dots, m\}$. Here $\oplus$ denotes modulo-2 sum, $b = (b_1, \dots, b_m)$ is a
Boolean vector, $b_a \in \{0,1\}$, given by the problem instance, and $x = (x_1, \dots, x_n)$ is a
vector of Boolean variables $x_i \in \{0,1\}$ that must be chosen to satisfy the clauses.

Standard linear algebra methods allow us to determine whether a given XORSAT instance 
admits a solution, to find a solution, and even to count the number of solutions, all in polynomial 
time. In this paper we shall be interested in the structural properties of the set of solutions
$S \subseteq \{0,1\}^n$ of a random $k$-XORSAT formula. More explicitly, we consider a random XORSAT
instance $\mathcal{I}$ that is drawn uniformly at random within the set $\mathbb{G}(n,k,m)$ of
instances with $m$ clauses over $n$ variables, whereby each clause involves exactly $k$ variables.
The set of solutions $S = S(\mathcal{I})$ is then defined as the set of binary vectors $x$ that
satisfy all $m$ clauses.
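The linear-algebra claim above can be illustrated with a minimal sketch of Gaussian elimination over GF(2). The function name `solve_xorsat` and the bitmask encoding are assumptions made for this illustration; they are not taken from the paper. The sketch decides satisfiability, returns one solution (free variables set to 0), and counts solutions as $2^{n - \mathrm{rank}}$.

```python
def solve_xorsat(n, clauses, b):
    """Solve an XORSAT instance by Gaussian elimination over GF(2).

    clauses: list of tuples of 0-based variable indices; b: list of 0/1 values.
    Returns (solution, count), or (None, 0) if the instance is unsatisfiable.
    """
    # Encode each equation as an integer: bits 0..n-1 are coefficients, bit n is b_a.
    rows = []
    for variables, rhs in zip(clauses, b):
        mask = 0
        for i in variables:
            mask ^= 1 << i
        rows.append(mask | (rhs << n))

    lhs_mask = (1 << n) - 1
    pivots = {}  # pivot column -> fully reduced row
    for row in rows:
        # Clear every existing pivot column from the incoming row.
        for col, prow in pivots.items():
            if (row >> col) & 1:
                row ^= prow
        lhs = row & lhs_mask
        if lhs == 0:
            if row >> n:          # the equation reads 0 = 1
                return None, 0
            continue              # redundant equation
        col = lhs.bit_length() - 1
        # Keep the stored rows reduced with respect to the new pivot column.
        for c in pivots:
            if (pivots[c] >> col) & 1:
                pivots[c] ^= row
        pivots[col] = row

    # With all free variables set to 0, each pivot variable equals its right-hand side.
    x = [0] * n
    for col, prow in pivots.items():
        x[col] = (prow >> n) & 1
    return x, 2 ** (n - len(pivots))
```

For example, `solve_xorsat(4, [(0, 1, 2), (1, 2, 3)], [1, 0])` returns a satisfying assignment together with the count 4, since the two equations are independent.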

Since $\mathcal{I}$ is a random formula, $S$ is a random subset of the Hamming hypercube. The structural
properties of $S$ are of interest for several reasons. First of all, linear systems over finite fields are
combinatorial objects that emerge naturally in a number of fields. Dietzfelbinger and collaborators
[DGM+10] use a mapping between XORSAT and the matching problem to establish tight thresholds
for the performances of Cuckoo Hashing, an archetypal load balancing scheme. Such thresholds
are computed by determining thresholds above which the set of solutions $S$ of a random XORSAT
formula becomes empty. The existence of solutions is in turn related to the existence of an
even-degree subgraph in a random hypergraph. Random sparse linear systems over finite fields are used
to construct capacity achieving error correcting codes [LMSS98, LMSS01, RU08]. The decodability
of such codes is related to the emergence of a non-trivial 2-core in the same random hypergraph,
a phenomenon that will play a crucial role in the following. Finally, structured linear systems over
finite fields are generated by popular factoring algorithms [KAF+10].

In the present paper we are mostly motivated by the close analogy between random $k$-XORSAT
and other random ensembles of constraint satisfaction problems (CSPs). The prototypical example
of this family is random $k$-satisfiability ($k$-SAT). The random $k$-SAT ensemble can be described in
complete analogy to random $k$-XORSAT with the modification of replacing exclusive OR clauses
by OR clauses among variables or their negations. Namely, in $k$-SAT each clause takes the form
$(\tilde{x}_{i_a(1)} \vee \cdots \vee \tilde{x}_{i_a(k)})$, whereby
$\tilde{x}_{i_a(\ell)} = x_{i_a(\ell)}$ or $\tilde{x}_{i_a(\ell)} = \overline{x}_{i_a(\ell)}$.
An extensive literature [MZK+99, MPZ03, ANP05, KMRT+07, MM09, ACO08] provides strong support for
the existence of two sharp thresholds in random $k$-SAT, as the number of clauses per variable
$\alpha = m/n$ grows. First, as $\alpha$ crosses a 'satisfiability threshold' $\alpha_s(k)$, random
$k$-SAT formulae pass from being with high probability (w.h.p.) satisfiable (for $\alpha < \alpha_s(k)$)
to being w.h.p. unsatisfiable (for $\alpha > \alpha_s(k)$). For any $\alpha < \alpha_s(k)$ the set of
solutions is therefore non-empty. However, it undergoes a dramatic structural change as $\alpha$
crosses a second threshold $\alpha_d(k) < \alpha_s(k)$. While for $\alpha < \alpha_d(k)$, $S$ is w.h.p.
'well connected' (more precise definitions will be given below), for
$\alpha \in (\alpha_d(k), \alpha_s(k))$ it shatters into an exponential number of clusters. It has been
argued that such a 'clustered' structure of the space of solutions can have an intimate relation with
the failure of standard polynomial time algorithms when applied to random formulae in this regime.
The same scenario is thought to hold for a number of random constraint satisfaction problems
(including for instance, proper coloring of random graphs, bicoloring random hypergraphs,
Not All Equal-SAT, etc.).

Unfortunately this fascinating picture is so far only conjectural. Even the best understood
element, namely the existence of a satisfiability threshold $\alpha_s(k)$, has not been established
(with the exception of the special case $k = 2$). In an earlier breakthrough, Friedgut [Fri99] used
Fourier-analytic methods to prove the existence of a (possibly $n$-dependent) sequence of thresholds
$\alpha_s(k; n)$. Proving that in fact this sequence can be taken to be $n$-independent is one of the most
challenging open problems in probabilistic combinatorics and random graph theory. Understanding
the precise connection between clustering of the space of solutions and computational complexity
is an even more daunting task.

Given such outstanding challenges, a fruitful line of research has pursued the analysis of somewhat
simpler models. A very interesting possibility is to study $k$-SAT formulae for large but still
bounded values of $k$. As explained in [ACO08], each SAT clause eliminates only one binary
assignment of its $k$ variables, out of $2^k$ possible assignments of the same variables. Hence, for $k$
large, a single clause has a small effect on the set of solutions, and most binary vectors are
satisfying unless the formula includes about $2^k$ clauses per variable. This results in an 'averaging'
effect and suitable moment methods provide asymptotically sharp results for large $k$. In particular,
Achlioptas and Peres [AP04] proved upper and lower bounds on $\alpha_s(k)$ that become asymptotically
equivalent (i.e. whose ratio converges to 1) as $k$ gets large. Achlioptas and Coja-Oghlan
[ACO08, ACORT11] proved that clustering indeed takes place in an interval of values of $\alpha$ below
the satisfiability threshold and obtained upper and lower bounds on the corresponding threshold
$\alpha_d(k)$ that are asymptotically equivalent for large $k$. Finally, Coja-Oghlan [CO10] proved that
solutions can be found w.h.p. in polynomial time for any $\alpha < \alpha_{d,\mathrm{alg}}(k)$, whereby
$\alpha_{d,\mathrm{alg}}(k)$ is asymptotically equivalent to $\alpha_d(k)$ for large $k$. Intriguingly, no
algorithm is known that can provably find solutions in polynomial time for
$\alpha \in ((1+\delta)\alpha_d(k), \alpha_s(k))$, for any $\delta > 0$.

XORSAT is a very different example on which rigorous mathematical analysis proved possible,
thus providing precious complementary insights. The key simplification is that the set of solutions
$S$ is, in this case, an affine subspace of the Hamming hypercube $\{0,1\}^n$ (viewed as a vector space
over GF[2]). This implies a high degree of symmetry that can be exploited to obtain very sharp
characterizations for large $n$, and any $k$ (we assume throughout that $k \ge 3$, since 2-XORSAT is
significantly simpler).

It was proved in [DM02] that, for $k = 3$, there exists an $n$-independent threshold $\alpha_s(k)$ such
that a random $k$-XORSAT instance is w.h.p. satisfiable if $\alpha < \alpha_s(k)$ and unsatisfiable if
$\alpha > \alpha_s(k)$. The proof constructs a subformula, by considering the 2-core of the hypergraph
associated with the XORSAT instance. One can then prove that the original formula is satisfiable if
and only if the 2-core subformula is. The threshold for the latter can be determined exactly using
the second moment method. The proof was extended to all $k \ge 4$ in [DGM+10].

The existence of a 2-core in a random XORSAT formula has a sharp threshold when the number
of clauses per variable $\alpha$ crosses a value $\alpha_{\rm core}(k)$. This was argued to be intimately
related to the appearance of clusters. In particular, [MRTZ03, CDMM03] give an argument¹ showing
that, above $\alpha_{\rm core}(k)$, the space of solutions shatters into exponentially many clusters. In
other words, $\alpha_{\rm core}(k)$ is an upper bound on the clustering threshold. [MRTZ03] further shows
that, for $\alpha < \alpha_{\rm core}(k)$, a particular coordinate of a solution can be changed by changing
$O(1)$ other variables on average, without leaving the space of solutions. If this argument is pushed a
step further, one can show that, w.h.p., any coordinate can be changed by flipping at most
$O(\log n)$ other coordinates. This suggests that it may be possible to concatenate a sequence of such
flips to connect any two solutions via a path through the solution space, with $O(\log n)$ steps.
However, the analysis of [MRTZ03] does not imply that this is the case, as it does not address the
main challenge, namely to construct a path from any solution to any other solution. In this work we
solve this problem, and provide the first proof of a lower bound of $\alpha_{\rm core}(k)$ on the
clustering threshold $\alpha_d$, thus establishing that indeed $\alpha_d(k) = \alpha_{\rm core}(k)$. For
$\alpha > \alpha_d(k)$ we prove a sharp characterization of the decomposition into clusters.

¹The argument of [MRTZ03, CDMM03] is essentially rigorous, but does not deal with several technical steps.

As mentioned above, random $k$-XORSAT formulae can be solved in polynomial time using linear
algebra methods, and this appears to be insensitive to the clustering threshold. Nevertheless, an
intriguing algorithmic phase transition might take place exactly at the clustering threshold $\alpha_d(k)$.
For any $\alpha < \alpha_d(k)$ solutions can be found in time linear in the number of variables (the
algorithm is in fact an important component of our proof). On the other hand, no algorithm is known
that finds a solution in linear time for $\alpha \in (\alpha_d(k), \alpha_s(k))$. We think that our proof
sheds some light on this phenomenon.

1.1 Main result 

In this paper we obtain two sharp results characterizing the clustering phase transition for random
$k$-XORSAT:

(i) We exactly determine the clustering threshold $\alpha_d(k)$, proving that the space of solutions is
w.h.p. well connected for $\alpha < \alpha_d(k)$, and instead shatters into exponentially many clusters
for $\alpha \in (\alpha_d(k), \alpha_s(k))$.

(ii) We determine the exponential growth rate of the number of clusters, i.e. we show that this is
w.h.p. $\exp\{n\Sigma(\alpha; k) + o(n)\}$ where $\Sigma(\alpha; k)$ is a non-random function which is
explicitly given. We prove that each of the clusters is itself 'well connected'.

This is therefore the first random CSP ensemble for which a sharp threshold for clustering is
proved.

Earlier literature fell short of establishing (i) since it did not provide any argument for
connectedness below $\alpha_d(k)$. Also, informal calculations only suggested a lower bound on the number
of clusters, but did not establish (ii) since they did not prove connectedness of each cluster by
itself. The situation is akin to the analysis of Markov Chain Monte Carlo methods: it is often
significantly more challenging to prove rapid mixing (connectedness of the space of configurations)
than the opposite (i.e. to find bottlenecks).

One important novelty is that the notion of connectedness used here is very strong and goes
beyond path connectivity, which was used earlier for $k$-SAT [ACO08, ACORT11]. We use a properly
defined notion of conductance which we think can be applied to a broader set of CSPs, and has the
advantage of being closely related to important algorithmic notions (fast mixing for MCMC and
expansion). Given a subset of the hypercube $S \subseteq \{0,1\}^n$, and a positive integer $\ell$, we
define the conductance of $S$ as follows. Construct the graph $\mathcal{G}(S,\ell)$ with vertex set $S$
and an edge connecting $x, x' \in S$ if and only if $d(x,x') \le \ell$ (here and below,
$d(\cdot,\cdot)$ denotes the Hamming distance). Then we define the $\ell$-th conductance of $S$ as the
graph conductance of $\mathcal{G}(S,\ell)$, namely

$$\Phi(S;\ell) \equiv \min_{A \subset S} \frac{\mathrm{cut}_{\mathcal{G}(S,\ell)}(A, S\setminus A)}{\min(|A|, |S\setminus A|)}, \qquad (1)$$

where $|B|$ denotes the cardinality of the set $B$. Notice that we measure the volume of a set by the
number of its vertices instead of the sum of its degrees.
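To make definition (1) concrete, here is a brute-force sketch that evaluates the $\ell$-th conductance of a small set $S$ by enumerating all subsets $A$ with $|A| \le |S|/2$ (every cut is covered because a set and its complement give the same ratio). The function names are hypothetical, the routine is exponential in $|S|$, it assumes $|S| \ge 2$, and it is only meant for toy examples.

```python
from itertools import combinations, product


def hamming(x, y):
    """Hamming distance between two equal-length binary tuples."""
    return sum(a != b for a, b in zip(x, y))


def conductance(S, ell):
    """Brute-force ell-th conductance of S, with volume measured in vertices."""
    S = list(S)
    N = len(S)
    # Adjacency of the graph G(S, ell): distinct points at Hamming distance <= ell.
    adj = [[i != j and hamming(S[i], S[j]) <= ell for j in range(N)]
           for i in range(N)]
    best = float("inf")
    for size in range(1, N // 2 + 1):
        for A in combinations(range(N), size):
            in_A = set(A)
            cut = sum(1 for i in A for j in range(N)
                      if j not in in_A and adj[i][j])
            best = min(best, cut / size)  # min(|A|, |S \ A|) = size here
    return best


# The full 3-dimensional hypercube with ell = 1 is well connected.
cube = list(product((0, 1), repeat=3))
# Two far-apart pairs of points are disconnected at ell = 1.
two_clusters = [(0,) * 6, (1,) + (0,) * 5, (1,) * 6, (0,) + (1,) * 5]
```

On these inputs `conductance(cube, 1)` evaluates to 1.0 (attained by a 2-dimensional subcube), while `conductance(two_clusters, 1)` is 0 because no edge joins the two pairs.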

We define the distance between two subsets of the hypercube $S_1, S_2 \subseteq \{0,1\}^n$ as

$$d(S_1, S_2) \equiv \min_{x \in S_1,\, x' \in S_2} d(x, x').$$



Theorem 1. Let $S$ be the set of solutions of a random $k$-XORSAT formula with $n$ variables and
$m = n\alpha$ clauses. For any $k \ge 3$, let $\alpha_d(k)$ be defined as

$$\alpha_d(k) \equiv \sup\left\{\alpha \in [0,1] : z > 1 - e^{-k\alpha z^{k-1}},\ \forall z \in (0,1)\right\}. \qquad (2)$$

1. If $\alpha < \alpha_d(k)$, there exists $C = C(\alpha,k) < \infty$ such that, w.h.p., $\Phi(S; (\log n)^C) \ge 1/2$.

2. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, then there exists $\varepsilon = \varepsilon(k;\alpha) > 0$ such that, w.h.p., $\Phi(S; n\varepsilon) = 0$.

3. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, and $\delta > 0$ is arbitrary, then there exist constants
$C = C(\alpha,k) < \infty$, $\varepsilon = \varepsilon(\alpha,k) > 0$, $\Sigma = \Sigma(\alpha,k) > 0$, and a
partition of the set of solutions $S = S_1 \cup \cdots \cup S_N$, such that, w.h.p., the following
properties hold:

(a) For each $a \in [N]$, we have $\Phi(S_a; (\log n)^C) \ge 1/2$.

(b) For each $a \ne b \in [N]$, we have $d(S_a, S_b) \ge n\varepsilon$.

(c) $\exp\{n(\Sigma - \delta)\} \le N \le \exp\{n(\Sigma + \delta)\}$. Further, letting $Q$ be the largest
positive solution of $Q = 1 - \exp\{-k\alpha Q^{k-1}\}$ and $\hat{Q} = Q^{k-1}$, we have
$\Sigma(\alpha,k) = Q - k\alpha\hat{Q} + (k-1)\alpha Q\hat{Q}$.
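The quantities in Theorem 1 are easy to evaluate numerically. The sketch below rewrites the condition in Eq. (2) as $\alpha < -\log(1-z)/(k z^{k-1})$ for all $z \in (0,1)$, so that $\alpha_d(k)$ is the infimum of the right-hand side (capped at 1), and it finds the largest fixed point $Q$ by iterating from $Q = 1$. The grid resolution, iteration count, and function names are assumptions chosen for illustration, not part of the paper.

```python
import math


def alpha_d(k, grid=200_000):
    """Approximate the clustering threshold of Eq. (2) on a uniform grid in z."""
    best = 1.0  # Eq. (2) restricts alpha to [0, 1]
    for j in range(1, grid):
        z = j / grid
        # z > 1 - exp(-k*alpha*z^(k-1))  <=>  alpha < -log(1 - z) / (k * z^(k-1))
        best = min(best, -math.log1p(-z) / (k * z ** (k - 1)))
    return best


def largest_fixed_point(alpha, k, iters=10_000):
    """Largest solution of Q = 1 - exp(-k*alpha*Q^(k-1)), iterating down from 1."""
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    return q


def sigma(alpha, k):
    """Growth rate Sigma(alpha, k) of the number of clusters, as in Theorem 1(c)."""
    q = largest_fixed_point(alpha, k)
    q_hat = q ** (k - 1)
    return q - k * alpha * q_hat + (k - 1) * alpha * q * q_hat
```

For $k = 3$ this gives $\alpha_d(3) \approx 0.818$, and $\Sigma(\alpha, 3)$ is positive just above that value and decreases to approximately zero near $\alpha \approx 0.918$, the known 3-XORSAT satisfiability threshold. The fixed-point iteration converges slowly when $\alpha$ is very close to $\alpha_d(k)$, so the sketch should not be trusted in that regime.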

1.2 Conductance and sparse basis 

We will prove Theorem 1 by obtaining a fairly complete description of the set $S$ both above and
below $\alpha_d(k)$. In a nutshell, for $\alpha < \alpha_d(k)$, $S$ admits a sparse basis, while for
$\alpha > \alpha_d(k)$ each of the clusters $S_1, \dots, S_N$ admits a sparse basis but their union does
not. This is particularly suggestive of the connection between the clustering phase transitions and
algorithm performance. Below $\alpha_d(k)$ the space of solutions admits a succinct explicit
representation (in $O(n(\log n)^C)$ bits). Above $\alpha_d(k)$, we can produce a representation that is
succinct but implicit (as solutions of a given formula), or explicit but prolix (no basis is known
that can be encoded in $o(n^2)$ bits).

Given a linear subspace $S \subseteq \{0,1\}^n$, we say that it admits an $s$-sparse basis if there exist
vectors $x^{(l)} \in S$ for $l \in \{1, \dots, D\}$ such that $d(x^{(l)}, 0) \le s$ and
$(x^{(l)})_{l=1}^{D}$ form a basis for $S$. The latter means that the vectors are linearly independent
and $S = \{\sum_{l=1}^{D} a_l x^{(l)} : (a_l)_{l=1}^{D} \in \{0,1\}^D\}$.

We say that an affine space $S \subseteq \{0,1\}^n$ admits a sparse basis if there exists $x^{(0)} \in S$
such that the linear subspace $S - x^{(0)}$ admits a sparse basis. The property of having a sparse
basis indeed implies large conductance. The proof is immediate.

Lemma 1.1. If the affine subspace $S \subseteq \{0,1\}^n$ admits an $s$-sparse basis, then $\Phi(S; s) \ge 1/2$.
Vice versa, assume that $\Phi(S; s) = 0$. Then $S$ does not admit an $s$-sparse basis.

Proof of Lemma 1.1. We can assume, without loss of generality, that $S$ is a linear space. Let $d$ be
its dimension. Further, given a graph $\mathcal{G}$ with vertex set $S$, let, with a slight abuse of notation,

$$\Phi(\mathcal{G}) \equiv \min_{A \subset S} \frac{\mathrm{cut}_{\mathcal{G}}(A, S\setminus A)}{\min(|A|, |S\setminus A|)}, \qquad (3)$$

so that $\Phi(S;\ell) = \Phi(\mathcal{G}(S;\ell))$.

Assume that $S$ admits an $s$-sparse basis. This immediately implies the graph $\mathcal{G}(S,s)$ contains
a spanning subgraph that is isomorphic to the $d$-dimensional hypercube $\mathcal{H}_d$. Further
$\mathcal{G} \mapsto \Phi(\mathcal{G})$ is monotone increasing in the edge set of $\mathcal{G}$. Therefore
$\Phi(S; s) \ge \Phi(\mathcal{H}_d) \ge 1/2$ where the last inequality follows from the standard
isoperimetric inequality on the hypercube [HLW06]. $\square$

The characterization of the solution space in terms of sparsity of its basis is given below. 



Theorem 2. Let $S$ be the set of solutions of a random $k$-XORSAT formula with $n$ variables and
$m = n\alpha$ clauses. For any $k \ge 3$, let $\alpha_d(k)$ be defined as per Eq. (2). Then the following hold:

1. If $\alpha < \alpha_d(k)$, there exists $C = C(\alpha,k) < \infty$ such that, w.h.p., $S$ admits a
$(\log n)^C$-sparse basis.

2. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, and $\delta > 0$ is arbitrary, then there exist constants
$C = C(\alpha,k) < \infty$, $\varepsilon = \varepsilon(\alpha,k) > 0$, $\Sigma = \Sigma(\alpha,k) > 0$, and a
partition of the set of solutions $S = S_1 \cup \cdots \cup S_N$, such that, w.h.p., the following
properties hold:

(a) For each $a \in [N]$, $S_a$ admits a $(\log n)^C$-sparse basis.

(b) For each $a \ne b \in [N]$, we have $d(S_a, S_b) \ge n\varepsilon$.

(c) $\exp\{n(\Sigma - \delta)\} \le N \le \exp\{n(\Sigma + \delta)\}$. Further, $\Sigma$ is given by the
same expression given in Theorem 1.

Clearly, this theorem immediately implies Theorem 1 by applying Lemma 1.1. The rest of this
paper is devoted to the proof of Theorem 2.

1.3 Outline of the paper 

In Section 2 we define some basic concepts and notations. Section 3 describes the construction of
clusters and sparse bases, and uses this construction to prove Theorem 2. Several basic lemmas
necessary for the proof are stated in this section.

Section 4 introduces a certain belief propagation (BP) algorithm and a technical tool called
density evolution, that play a key role in our analysis: the BP algorithm naturally decomposes the
linear system into a 'backbone' (consisting roughly of the 2-core and the variables implied by it)
and a 'periphery'. Density evolution allows us to track the progress of BP, eventually facilitating a
tight characterization of basic parameters (like number of nodes) of the backbone and periphery.

Section 5 bounds the number of iterations of a 'peeling' algorithm (related to BP) that plays
a key role in our construction of a sparse basis. Section 6 proves a sharp characterization of the
periphery. Together, this yields the first (large) set of basis vectors.

Section 7 shows the 2-core has very few sparse solutions, leading to well separated, small,
'core-clusters'. Section 8 shows how to produce a sparse solution of the linear system corresponding to
each sparse solution of the 2-core subsystem. This yields the second (small) set of basis vectors in
our construction.

Several technical lemmas are deferred to the appendices. 

2 Random k-XORSAT: Definitions and notations

As described in the introduction, each $k$-XORSAT clause is actually a linear equation over GF[2]:
$x_{i_a(1)} \oplus \cdots \oplus x_{i_a(k)} = b_a$, for $a \in [m] \equiv \{1, \dots, m\}$. Introducing a
vector $h_a \in \{0,1\}^n$, with non-zero entries only at positions $i_a(1), \dots, i_a(k)$, this can be
written as $h_a^{T} x = b_a$. Hence an instance is completely specified by the pair $(\mathbb{H}, b)$
where $\mathbb{H} \in \{0,1\}^{m \times n}$ is a matrix with rows $h_1^{T}, \dots, h_m^{T}$ and
$b = (b_1, \dots, b_m)^{T} \in \{0,1\}^m$. The space of solutions is therefore
$S \equiv \{x \in \{0,1\}^n : \mathbb{H}x = b \bmod 2\}$.
If $S$ has at least one element $x^{(0)}$, then $S \oplus x^{(0)}$ is just the set of solutions of the
homogeneous linear system corresponding to $b = 0$ (the kernel of $\mathbb{H}$). In the following we
shall always assume $\alpha < \alpha_s(k)$, so that $S$ is non-empty w.h.p. [DGM+10]. Since we are only
interested in properties of the set of solutions that are invariant under translation, we will assume
hereafter that $b = 0$.



An XORSAT instance is therefore completely specified by a binary matrix $\mathbb{H}$, or equivalently
by the corresponding factor graph $G = (F, V, E)$. This is a bipartite graph with two sets of nodes:
$F$ (factor or check nodes) corresponding to rows of $\mathbb{H}$, and $V$ (variable nodes) corresponding
to columns of $\mathbb{H}$. The edge set $E$ includes those pairs $(a,i)$, $a \in F$, $i \in V$ such that
$\mathbb{H}_{ai} = 1$. We denote by $\mathbb{G}(n,k,m)$ the set of all graphs with $n$ labeled variable
nodes and $m$ labeled check nodes, each having degree exactly $k$ (with no double edges). Note that
$|\mathbb{G}(n,k,m)| = \binom{n}{k}^m$. With a slight abuse of notation, we will denote by
$\mathbb{G}(n,k,m)$ also the uniform distribution over this set, and write $G \sim \mathbb{G}(n,k,m)$
for a uniformly random such graph.
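Since $|\mathbb{G}(n,k,m)| = \binom{n}{k}^m$, a uniform element of the ensemble is obtained by drawing each of the $m$ check nodes independently as a uniformly random $k$-subset of the variables. The sketch below does this and also builds the corresponding matrix $\mathbb{H}$ as a list of rows; the function names and the `seed` parameter are illustrative assumptions.

```python
import random


def sample_instance(n, k, m, seed=None):
    """Draw m independent uniform k-subsets of {0, ..., n-1}, one per check node."""
    rng = random.Random(seed)
    return [tuple(sorted(rng.sample(range(n), k))) for _ in range(m)]


def to_matrix(n, clauses):
    """Binary m x n matrix H with H[a][i] = 1 iff variable i appears in clause a."""
    H = [[0] * n for _ in clauses]
    for a, clause in enumerate(clauses):
        for i in clause:
            H[a][i] = 1
    return H


# Example: a small instance with clause density alpha = m / n = 0.8.
clauses = sample_instance(n=10, k=3, m=8, seed=0)
H = to_matrix(10, clauses)
```

Sampling without replacement inside each clause enforces the "no double edges" condition, while different check nodes are allowed to coincide, in agreement with the count $\binom{n}{k}^m$.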

For $v \in V$ or $v \in F$, we denote by $\deg_G(v)$ the degree of node $v$ in graph $G$ (omitting the
subscript when clear from the context) and we let $\partial v$ denote the set of neighbors of $v$. We
define distance with respect to $G$ between two variable nodes $i, j \in V$, denoted by $d_G(i,j)$, as
the length of the shortest path from $i$ to $j$ in $G$, whereby the length of a path is the number of
check nodes encountered along the path. Given a vector $x$, we denote by $x_A = (x_i)_{i \in A}$ its
restriction to $A$. The cardinality of set $A$ is denoted by $|A|$.

We only consider the 'interesting' case $k \ge 3$, and the asymptotics $m, n \to \infty$ with
$m/n \to \alpha$ and $\alpha \in [0, \alpha_s(k))$, where $\alpha_s(k)$ is the satisfiability threshold.
Hence $\mathbb{H}$ has w.h.p. maximum rank, i.e. $\mathrm{rank}(\mathbb{H}) = m$ [MM09].

Definition 2.1. Let $F_0 \subseteq F$. The subgraph induced by $F_0$ is defined as $(F_0, V_0, E_0)$ where
$V_0 \equiv \{i \in V : \partial i \cap F_0 \ne \emptyset\}$ and
$E_0 \equiv \{(a,i) \in E : a \in F_0, i \in V_0\}$. A check-induced subgraph is the
subgraph $(F_0, V_0, E_0)$ induced by some $F_0 \subseteq F$. Similarly, we can define the subgraph
induced by $V_0 \subseteq V$, and variable-induced subgraphs.

Let $F_0 \subseteq F$, $V_0 \subseteq V$. The subgraph induced by $(F_0, V_0)$ is defined as
$(F_0, V_0, E_0)$ where $E_0 \equiv \{(a,i) \in E : a \in F_0, i \in V_0\}$.

Definition 2.2. A stopping set is a check-induced subgraph with the property that every variable 
node has degree larger than one with respect to the subgraph. The 2-core of G is its maximal 
stopping set. 

Notice that the maximal stopping set of $G$ is uniquely defined because the union of two stopping
sets is a stopping set.

A key fact to be used in the following is that a giant 2-core appears abruptly at $\alpha_d(k)$. Forms
of the following statement appear in [LMSS98, Mol05, DM08].

Theorem 3 ([LMSS98, Mol05, DM08]). Assume $\alpha < \alpha_d(k)$. Then, w.h.p., a graph
$G \sim \mathbb{G}(n,k,m)$ does not contain any stopping set.

Vice versa, assume $\alpha > \alpha_d(k)$. Then there exists $C(k) > 0$ such that, w.h.p., a graph $G$
drawn uniformly at random from $\mathbb{G}(n,k,m)$ contains a 2-core of size larger than $C(k)n$.

Finally, we will often refer to the depth-$t$ neighborhood of a node $v$ in $G$.

Definition 2.3. Given a node $v \in V$ and an integer $t$, let $V' = \{u : u \in V, d_G(u,v) \le t\}$.
Then the ball of radius $t$ around node $v$ is defined as the (variable-induced) subgraph $B_G(v,t)$
induced by $V'$. With an abuse of notation, we will use the same notation for the set of variable
nodes in $B_G(v,t)$. Lastly, we define $|B_G(v,t)|$ to be the number of variable nodes in the subgraph
$B_G(v,t)$.

3 Proof of Theorem 2

In this section we describe the construction of clusters and sparse bases within the clusters (or for
the whole space of solutions for $\alpha \in [0, \alpha_d(k))$). The analysis of this construction is given
later in this section in terms of a few technical lemmas, followed by the formal proof of Theorem 2.

Synchronous Peeling (Graph $G = (F, V, E)$)
    $F' \leftarrow F$
    $V' \leftarrow V$
    $E' \leftarrow E$
    $J_0 \leftarrow (F, V, E)$, $t = 0$
    While $J_t$ has a variable node of degree $\le 1$ do
        $t \leftarrow t + 1$
        $V_t \leftarrow \{v \in V' : \deg_{J_{t-1}}(v) \le 1\}$
        $F_t \leftarrow \{a \in F' : (v,a) \in E' \text{ for some } v \in V_t\}$
        $E_t \leftarrow \{(v,a) \in E' : a \in F_t, v \in V'\}$
        $F' \leftarrow F' \setminus F_t$
        $V' \leftarrow V' \setminus V_t$
        $E' \leftarrow E' \setminus E_t$
        $J_t \leftarrow (F', V', E')$
    End While
    $T_C \leftarrow t$
    $G_C \leftarrow J_{T_C}$
    Return $(G_C, T_C, (F_t)_{t=1}^{T_C}, (V_t)_{t=1}^{T_C}, (E_t)_{t=1}^{T_C})$

Table 1: Synchronous peeling algorithm

3.1 Construction of the sparse basis 

The construction of a sparse basis, which is at the heart of Theorem 2, is based on the following
algorithm, formally stated in Table 1. The algorithm constructs a sequence of residual factor graphs
$(J_t)_{t \ge 0}$, starting with the instance under consideration $J_0 = G$. At each step, the new graph
is constructed by removing all variable nodes of degree one or zero, their adjacent factor nodes, and
all the edges adjacent to these factor nodes. We refer to the algorithm as synchronous peeling or
simply peeling.

We denote the sets of nodes and edges removed at step (or round) $t \ge 1$ by $(F_t, V_t, E_t)$, so that
$J_{t-1} = (F_t, V_t, E_t) \cup J_t$. Notice that, at each step, the residual graph $J_t$ is
check-induced. The algorithm halts when the residual graph does not contain any variable node of
degree smaller than two. We let the total number of iterations be $T_C(G)$, where we will drop the
explicit dependence on $G$ when it is clear from context. The final residual graph is then
$J_{T_C} = G_C$. The following elementary fact is used in several papers on this topic
[LMSS98, Mol05, DM08].

Remark 3.1. The residual graph $G_C$ resulting at the end of synchronous peeling is the 2-core of $G$.
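A minimal sketch of synchronous peeling, following Table 1, may help fix ideas. Clauses are represented as tuples of variable indices; this representation, the function name, and the returned tuple layout are assumptions made for the example. Variables that become isolated in one round are removed in the next one, which is why the last set $F_t$ can be empty.

```python
def synchronous_peeling(n, clauses):
    """Run synchronous peeling on the factor graph given by `clauses`.

    Returns (core_factors, core_variables, T, V_rounds, F_rounds), where
    V_rounds[t-1] and F_rounds[t-1] are the sets V_t and F_t removed in round t.
    """
    # nbrs[i] = set of still-present factors adjacent to variable i.
    nbrs = [set() for _ in range(n)]
    for a, clause in enumerate(clauses):
        for i in clause:
            nbrs[i].add(a)

    alive_vars = set(range(n))
    alive_factors = set(range(len(clauses)))
    V_rounds, F_rounds = [], []
    while True:
        # All variables of degree at most one in the current residual graph.
        Vt = {v for v in alive_vars if len(nbrs[v]) <= 1}
        if not Vt:
            break
        # Factors adjacent to some removed variable.
        Ft = set()
        for v in Vt:
            Ft |= nbrs[v]
        # Remove those factors together with all their edges.
        for a in Ft:
            for i in clauses[a]:
                nbrs[i].discard(a)
        alive_factors -= Ft
        alive_vars -= Vt
        V_rounds.append(Vt)
        F_rounds.append(Ft)
    return alive_factors, alive_vars, len(V_rounds), V_rounds, F_rounds
```

On the two-clause chain `[(0, 1, 2), (2, 3, 4)]` with `n = 5` the peeling takes two rounds and leaves an empty core. Adding the pendant clause `(3, 4, 5)` to the four triples on `{0, 1, 2, 3}` gives one round of peeling and a core made of the four original triples.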

It is convenient to reorder the factors (from 1 to $m$) and variables (from 1 to $n$) as follows.
We index the factors in increasing order according to $F_1, F_2, \dots, F_{T_C}$, choosing an arbitrary
order within each $F_t$ for $1 \le t \le T_C$.

For the variable nodes, we first index nodes in $V_1$, then nodes in $V_2$ and so on. Within each set
$V_t$, the ordering is chosen in such a way that nodes that have degree 0 in $J_{t-1}$ have lower index
than those with degree 1 (notice that, by definition, for any $v \in V_t$, $\deg_{J_{t-1}}(v) \le 1$).
Finally, for variable nodes in $V_t$ that have degree 1 in $J_{t-1}$, we use the following ordering.
Each such node $v \in V_t$ is connected to a unique factor node in $F_t$. Call this the associated
factor, and denote it by $f_v$. We order the nodes with $\deg_{J_{t-1}}(v) = 1$ according to the order
of their associated factor, choosing an arbitrary internal order for variable nodes with the same
associated factor.

For $A \subseteq F$, $B \subseteq V$, we denote by $\mathbb{H}_{A,B}$ the submatrix of $\mathbb{H}$
consisting of rows with index $a \in A$ and columns $i \in B$. The following structural lemma is
immediate, and we omit its proof.

Lemma 3.2. Let $G$ be any factor graph (not necessarily in $\mathbb{G}(n,k,m)$) with no 2-core. With the
order of factors and variable nodes defined through synchronous peeling, the matrix $\mathbb{H}$ is
partitioned in $T_C \times T_C$ blocks $\{\mathbb{H}_{F_s,V_t}\}_{1 \le s \le T_C,\, 1 \le t \le T_C}$ with the
following structure:

1. For any $s > t$, $\mathbb{H}_{F_s,V_t} = 0$.

2. The diagonal blocks $\mathbb{H}_{F_s,V_s}$ have a staircase structure, namely for each such block the
columns can be partitioned into consecutive groups $(\mathcal{C}_l)_{l=0}^{\ell}$, for $\ell = |F_s|$, such
that columns in $\mathcal{C}_0$ are equal to 0, columns in $\mathcal{C}_1$ have only the first entry
equal to 1, columns in $\mathcal{C}_2$ have only the second entry equal to 1, etc. See below for an example.



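An example of a staircase matrix:

$$\begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \qquad (4)$$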



Note that $V_t$ is not empty and $F_t$ is not empty for all $t < T_C$. On the other hand, $F_{T_C}$ may be
empty, in which case we adopt the convention that all columns corresponding to $V_{T_C}$ are included
in $\mathcal{C}_0$.

The above ordering reduces $\mathbb{H}$ to an essentially upper triangular matrix. It is then immediate
to construct a basis for its kernel. We will do this by partitioning the set of variable nodes as
the disjoint union $V = U \cup W$ in such a way that $\mathbb{H}_{F,U} \in \{0,1\}^{m \times m}$ is square
with full rank, and $\mathbb{H}_{F,W} \in \{0,1\}^{m \times (n-m)}$. We then treat $x_W$ as independent
variables and $x_U$ as dependent ones. The partition is then constructed by letting
$W = W_1 \cup \cdots \cup W_{T_C}$ and $U = U_1 \cup \cdots \cup U_{T_C}$, whereby for
each $t \in \{1, \dots, T_C\}$, $W_t \subseteq V_t$ is chosen by considering the staircase structure of
block $\mathbb{H}_{F_t,V_t}$ and the corresponding partition over columns
$V_t = \mathcal{C}_0 \cup \mathcal{C}_1 \cup \cdots \cup \mathcal{C}_\ell$. We let
$W_t = \mathcal{C}_0 \cup \mathcal{C}'_1 \cup \cdots \cup \mathcal{C}'_\ell$, where $\mathcal{C}'_i$ includes
all the elements of $\mathcal{C}_i$ except the first (and is empty if $|\mathcal{C}_i| = 1$). Finally
$U_t = V_t \setminus W_t$. With these definitions, $\mathbb{H}_{F,U}$ is an $m \times m$ binary matrix with
full rank. In addition, it is upper triangular with diagonal blocks
$\mathbb{H}_{F_t,U_t} = I_{|U_t|}$ for $t = 1, \dots, T_C$, where $I_r$ is the $r \times r$ identity matrix.
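The construction just described amounts to back-substitution in the order given by peeling. The sketch below is one possible rendering under stated assumptions: clauses are tuples of distinct variable indices, the dependent variable of each check is taken to be the first degree-one variable attached to it in its removal round (one admissible choice of $U$), and the helper names are hypothetical. It returns one kernel vector per independent variable in $W$.

```python
def peel_rounds(n, clauses):
    """Peel the factor graph; return a list of (F_t, pivot_t) per round.

    pivot_t maps each factor in F_t to its chosen dependent variable.
    Raises ValueError if the graph has a non-empty 2-core.
    """
    nbrs = [set() for _ in range(n)]
    for a, clause in enumerate(clauses):
        for i in clause:
            nbrs[i].add(a)
    alive = set(range(n))
    rounds = []
    while True:
        Vt = sorted(v for v in alive if len(nbrs[v]) <= 1)
        if not Vt:
            break
        pivot = {}
        for v in Vt:
            for a in nbrs[v]:          # at most one factor per peeled variable
                pivot.setdefault(a, v)
        Ft = sorted(pivot)
        for a in Ft:
            for i in clauses[a]:
                nbrs[i].discard(a)
        alive -= set(Vt)
        rounds.append((Ft, pivot))
    if alive:
        raise ValueError("the factor graph has a non-empty 2-core")
    return rounds


def sparse_kernel_basis(n, clauses):
    """Basis of the kernel of H obtained by back-substitution along the peeling order."""
    rounds = peel_rounds(n, clauses)
    pivot_of = {}
    for _, pivot in rounds:
        pivot_of.update(pivot)
    dependent = set(pivot_of.values())                    # the set U
    independent = [v for v in range(n) if v not in dependent]  # the set W
    basis = []
    for w in independent:
        x = [0] * n
        x[w] = 1
        # Factors removed last only involve variables that are already fixed.
        for Ft, _ in reversed(rounds):
            for a in Ft:
                u = pivot_of[a]
                x[u] = sum(x[i] for i in clauses[a] if i != u) % 2
        basis.append(x)
    return basis
```

Each basis vector is nonzero in exactly one independent coordinate, so the vectors are linearly independent, and their number is $n - m$ when the graph has no 2-core. How sparse they are depends on how far each flip propagates through the peeling rounds, which is the quantity the paper controls.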

In order to construct a sparse basis for the clusters when $\alpha > \alpha_d(k)$ (and hence prove
Theorem 2, point 2(a)), we will have to consider matrices $\mathbb{H}$ (without a 2-core) that contain
rows with exactly 2 non-zero entries (i.e., check nodes of degree 2). Whenever this happens, the
construction must be modified, by introducing the notion of collapsed graph. The basic idea is that
a factor node of degree 2 constrains the adjacent variables to be identical and hence we can replace
all of them by a single variable.

Definition 3.3. The collapsed graph $G_* = (F_*, V_*, E_*)$ of a graph $G = (F, V, E)$ is the graph of
connected components in the subgraph induced by factor nodes of degree 2. Formally,

$$F_* \equiv \{f \in F : |\partial f| \ge 3\},$$
$$V_* \equiv \{S \subseteq V : d_{G^{(2)}}(i,j) < \infty,\ \forall i, j \in S\},$$
$$E_* \equiv \{(S,a) : a \in F_*,\ |\{i \in S \text{ s.t. } (i,a) \in E\}| \text{ is odd}\},$$

where $G^{(2)}$ is the subgraph of $G$ induced by factor nodes of degree 2. We let $n_* \equiv |V_*|$,
$m_* \equiv |F_*|$. An element of $V_*$ is referred to as a super-node.
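The collapsed graph can be computed with a union-find structure: the two endpoints of every degree-2 check are merged, and each check of degree at least 3 is then connected to the super-nodes it touches an odd number of times. The sketch below assumes clauses are given as tuples of variable indices and takes super-nodes to be the maximal connected components; the function name and output layout are illustrative.

```python
def collapsed_graph(n, clauses):
    """Return (super_nodes, star_clauses) for the collapsed graph G_*.

    super_nodes[s] lists the original variables merged into super-node s;
    star_clauses lists, for each factor of degree >= 3, the super-nodes it
    touches an odd number of times.
    """
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # A degree-2 check forces its two variables to be equal: merge them.
    for clause in clauses:
        if len(clause) == 2:
            parent[find(clause[0])] = find(clause[1])

    roots = sorted({find(i) for i in range(n)})
    index = {r: s for s, r in enumerate(roots)}
    super_nodes = [[] for _ in roots]
    for i in range(n):
        super_nodes[index[find(i)]].append(i)

    star_clauses = []
    for clause in clauses:
        if len(clause) >= 3:
            count = {}
            for i in clause:
                s = index[find(i)]
                count[s] = count.get(s, 0) + 1
            # Variables inside one super-node are equal, so pairs cancel modulo 2.
            star_clauses.append(
                tuple(sorted(s for s, c in count.items() if c % 2 == 1)))
    return super_nodes, star_clauses
```

For instance, with `n = 5` and clauses `[(0, 1), (1, 2), (2, 3, 4), (0, 1, 4)]`, variables 0, 1, 2 merge into one super-node; the clause `(0, 1, 4)` touches that super-node twice and therefore collapses to a constraint on variable 4 alone.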
Note that for a graph $G$ with no 2-core, the collapsed graph $G_*$ also has no 2-core. We let
$\mathbb{Q}$ denote the corresponding adjacency matrix of $G_*$. Finally, we construct a binary matrix
$\mathbb{L}$ with rows indexed by $V$ and columns indexed by $V_*$, and such that $\mathbb{L}_{i,v} = 1$
if and only if $i$ belongs to connected component $v$. We apply peeling to $G_*$, thus obtaining the
decomposition of $V_*$ into $U_* \cup W_*$ as described for the original graph $G$ above.

The following is the key deterministic lemma on the construction of the basis. We denote the size
of the component of $v \in V_*$ in $G^{(2)}$ by $S(v)$, and for $v \in V_*$, $t \ge 0$ we let
$S(v,t) \equiv \sum_{w \in B_{G_*}(v,t)} S(w)$ be the sum of sizes of vertices within distance $t$ from $v$.



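Lemma 3.4. Assume that $G_*$ has no 2-core. Then the columns of

$$\mathbb{L} \begin{bmatrix} (\mathbb{Q}_{F_*,U_*})^{-1}\, \mathbb{Q}_{F_*,W_*} \\ I_{(n_* - m_*) \times (n_* - m_*)} \end{bmatrix}$$

form an $s$-sparse basis of the kernel of $\mathbb{H}$, with
$s = \max_{v_* \in V_*} S(v_*, T_C(G_*))$. Here we have ordered the super-nodes $v_* \in V_*$ as $U_*$
followed by $W_*$, and the matrix inverse is taken over GF[2].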

The proof of Lemma 3.4 is presented in the appendices.

3.2 Construction of the cluster decomposition 

When $G$ does not contain a 2-core (which happens w.h.p. for $\alpha < \alpha_d(k)$) the above lemma is
sufficient to characterize the space of solutions $S$. When $G$ contains a 2-core (w.h.p. for
$\alpha > \alpha_d(k)$) we need to construct the partition of the space of solutions
$S_1 \cup \cdots \cup S_N$.

We let $G_C = (F_C, V_C, E_C)$ denote the 2-core of $G$, and $P_C : \{0,1\}^{V} \to \{0,1\}^{V_C}$ be the
projector that maps a vector $x$ to its restriction $x_{V_C}$. Next, we let
$\mathbb{H}_C = \mathbb{H}_{F_C,V_C}$ be the restriction of $\mathbb{H}$ to the 2-core, and denote its
kernel by $S_C$. Obviously, for any $x \in S$, we have $P_C x \in S_C$. Further

$$S = \bigcup_{x_C \in S_C} S(x_C), \qquad S(x_C) \equiv \{x \in S : P_C x = x_C\}, \qquad (5)$$

with $\{S(x_C)\}_{x_C \in S_C}$ forming a partition of $S$.

It is easy to check that $\mathbb{H}_{F \setminus F_C,\, V \setminus V_C}$ has full row rank. For instance,
this follows from the fact that the subgraph induced by $(F \setminus F_C, V \setminus V_C)$ is
annihilated by peeling (cf. Remark 3.1). Thus, $S(x_C)$ is nonempty for all $x_C \in S_C$, and the sets
$S(x_C)$ are simply translations of each other.

It turns out that $\{S(x_C)\}_{x_C \in S_C}$ is not exactly the partition of $S$ that we seek. In our
next lemma, we show that the set of solutions of the core $S_C$ can be partitioned in well-separated
core-clusters. Moreover, the core-clusters are small and have a high conductance. We will form sets in
our partition of $S$ by taking the union of $S(x_C)$ over $x_C$ that lie in a particular core-cluster.

We write $x' \preceq x$ for binary vectors $x', x$ if $x'_i \le x_i$ for all $i$. We write $x' \prec x$ if
$x' \preceq x$ and $x' \ne x$. We need the following definition:

$$\mathcal{C}_C(\ell) \equiv \{x : x \in S_C(G),\ d(x,0) \le \ell,\ \nexists\, x' \in S_C(G) \setminus \{0\} \text{ s.t. } x' \prec x\}. \qquad (6)$$

The set $\mathcal{C}_C(\ell)$ consists of minimal nonzero solutions of the 2-core having weight at most
$\ell$. (Here the support of a binary vector $x$ is the subset of its coordinates that are non-zero, and
the weight of $x$ is the size of its support.)

Lemma 3.5. For any $\alpha \in (\alpha_d(k), \alpha_s(k))$, there exists $\varepsilon = \varepsilon(\alpha,k) > 0$
such that the following holds. Take any sequence $(s_n)_{n \ge 1}$ such that
$\lim_{n \to \infty} s_n = \infty$ and $s_n \le \varepsilon n$. Let $G \sim \mathbb{G}(n,k,\alpha n)$.
Then, w.h.p., we have: (i) $\mathcal{C}_C(\varepsilon n) = \mathcal{C}_C(s_n)$; (ii)
$|\mathcal{C}_C(s_n)| \le s_n$; (iii) For any $x, x' \in \mathcal{C}_C(s_n)$, we have
$x \wedge x' = 0$, where $\wedge$ denotes bitwise AND. In other words, different elements of
$\mathcal{C}_C(\varepsilon n)$ have disjoint supports.

Lemma 3.5 is proved in Section 7.

Remark 3.6. Let $E$ be the event that points (i), (ii) and (iii) in Lemma 3.5 hold. Assume $E$ and
$s_n^2 < \varepsilon n$. Let $\mathcal{S}_{C,1}$ be the set of core solutions with weight less than $\varepsilon n$. Then $\mathcal{S}_{C,1}$ forms a linear
space over GF(2) of dimension $|\mathcal{L}_C(\varepsilon n)|$, with $\mathcal{L}_C(\varepsilon n)$ being an $s_n$-sparse basis for $\mathcal{S}_{C,1}$. Moreover,
every element of $\mathcal{S}_{C,1}$ is $s_n^2$-sparse.

Let

$$g = 2^{|\mathcal{L}_C(\varepsilon n)|}. \tag{7}$$

We partition the set $\mathcal{S}_C$ of core solutions in disjoint core-clusters, as follows. For $x, x' \in \mathcal{S}_C$, we write
$x \sim x'$ if $x \oplus x' \in \mathrm{span}(\mathcal{L}_C(\varepsilon n))$. It is immediate to see that $\sim$ is an equivalence relation. We define
the core-clusters to be the equivalence classes of $\sim$. Obviously the core-clusters are affine spaces
that differ by a translation, each containing $g \le 2^{s_n}$ solutions. We denote their number by $N$, and
the core-clusters themselves by $\mathcal{S}_{C,1}, \mathcal{S}_{C,2}, \dots, \mathcal{S}_{C,N}$. Note that for any $x, x' \in \mathcal{S}_C$ belonging to
different core-clusters, we have $d(x,x') \ge n\varepsilon$, i.e., the core-clusters are well separated. We use the
following partition of the solution space (including non-core variables) $\mathcal{S}$ into clusters, based on the
core-clusters defined above:

$$\mathcal{S} = \bigcup_{i=1}^{N} \mathcal{S}_i, \qquad \mathcal{S}_i = \{x \in \mathcal{S} : P_C x \in \mathcal{S}_{C,i}\}. \tag{8}$$
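The equivalence relation that defines the core-clusters can be illustrated with a small sketch. This is only an illustration under stated assumptions: vectors are encoded as Python integers used as bit masks, the function name `same_core_cluster` is hypothetical, and the basis vectors are assumed to have pairwise disjoint supports, as guaranteed w.h.p. by Lemma 3.5 (iii).

```python
def same_core_cluster(x, y, basis):
    """Return True if x XOR y lies in span(basis) over GF(2).

    Vectors are Python ints used as bit masks. The basis vectors are
    assumed to have pairwise disjoint supports (Lemma 3.5 (iii)), so a
    basis vector is part of the decomposition exactly when it overlaps
    the current difference vector.
    """
    d = x ^ y
    for b in basis:
        if d & b:      # d overlaps the support of b, so b must be used
            d ^= b
    return d == 0


# Hypothetical toy data: two minimal solutions with disjoint supports.
basis = [0b000011, 0b001100]
g = 2 ** len(basis)            # number of solutions per core-cluster, Eq. (7)
x = 0b110000
y = 0b110011                   # differs from x by the first basis vector
z = 0b110001                   # differs from x by a vector outside the span
```

With disjoint supports the membership test needs a single pass over the basis; for a general basis one would use Gaussian elimination over GF(2) instead.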

A version of Lemma 3.5 was claimed in [MRTZ03, CDMM03, MM09]. These papers capture
the essence of the proof but miss some technical details, and make the erroneous claim that, w.h.p.,
each pair of core solutions is separated by Hamming distance $\Omega(n)$.

We next want to study the internal structure of clusters. By linearity, it is sufficient to consider
only one of them, say $\mathcal{S}_1$, which we can take to contain the origin $0$. For any $x \in \mathcal{S}_1$, we have
$P_C x \in \mathcal{S}_{C,1} = \mathrm{span}(\mathcal{L}_C(\varepsilon n))$, and $\mathcal{L}_C(\varepsilon n)$ forms an $s_n$-sparse basis for $\mathcal{S}_{C,1}$, which coincides with the
projection of $\mathcal{S}_1$ onto the core. Consider the subset of solutions $x \in \mathcal{S}_1$ such that $P_C x = x_C$ for
some $x_C \in \mathcal{S}_{C,1}$. The set of variables that take the same value for all solutions in this set is strictly
larger than the 2-core. In order to capture this remark, we define the backbone (variables that are
uniquely determined by the core assignment) and periphery (other variables) of a graph $G$.

Definition 3.7. Define the backbone augmentation procedure on $G$ with the initial check-induced
subgraph $G_B^{(0)}$ as follows. Start with $G_B^{(0)}$. For any $t \ge 0$ pick all check nodes which are not in $G_B^{(t)}$
and have at most one neighbor outside $G_B^{(t)}$. Build $G_B^{(t+1)}$ by adding all these check nodes and their
incident edges and neighbors to $G_B^{(t)}$. If no such check nodes exist, terminate and output $G_B = G_B^{(t)}$.

The backbone $G_B = (F_B, V_B, E_B)$ of a graph $G = (F, V, E)$ is the output of the backbone augmentation
procedure on $G$ with the initial subgraph $G_C$, the 2-core of the graph $G$.

The periphery $G_P$ of a graph $G = (F, V, E)$ is the subgraph induced by the factor nodes $F_P =
F \setminus F_B$ and variable nodes $V_P = V \setminus V_B$ that are not in the backbone.^1
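The backbone augmentation procedure of Definition 3.7 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function name `backbone_augmentation`, the dictionary representation of the factor graph and the toy graph are assumptions, and the check nodes of the 2-core are taken as given.

```python
def backbone_augmentation(checks, initial_checks):
    """Grow a check-induced subgraph as in Definition 3.7.

    checks: dict mapping each check node to the tuple of its variable nodes.
    initial_checks: check nodes of the initial subgraph (e.g. the 2-core).
    Returns (backbone_checks, backbone_vars).
    """
    in_checks = set(initial_checks)
    in_vars = {v for a in in_checks for v in checks[a]}
    while True:
        # Checks outside the current subgraph with at most one outside neighbor.
        new = [a for a, nbrs in checks.items()
               if a not in in_checks
               and sum(v not in in_vars for v in nbrs) <= 1]
        if not new:
            return in_checks, in_vars
        # Add all selected checks at once, together with their neighbors.
        for a in new:
            in_checks.add(a)
            in_vars.update(checks[a])


# Hypothetical toy graph: c0..c3 form a 2-core on variables 0..3.
checks = {
    "c0": (0, 1, 2), "c1": (0, 1, 3), "c2": (0, 2, 3), "c3": (1, 2, 3),
    "c4": (0, 1, 4), "c5": (5, 6, 7), "c6": (3, 4, 5),
}
F_B, V_B = backbone_augmentation(checks, {"c0", "c1", "c2", "c3"})
F_P = set(checks) - F_B                                     # periphery checks
V_P = {v for nbrs in checks.values() for v in nbrs} - V_B   # periphery variables
```

In this toy graph the check `c4` joins the backbone in the first round (its only outside neighbor is variable 4), `c6` joins in the second round, and `c5` stays in the periphery because two of its neighbors remain outside.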

We can now define our basis for $\mathcal{S}_1$. This is formed by two sets of vectors. The first set has
a vector corresponding to each element of $\mathcal{L}_C(\varepsilon n)$. For each $x_C \in \mathcal{L}_C(\varepsilon n)$, we construct a sparse
solution $x \in \mathcal{S}_1$ such that $P_C x = x_C$ (Lemma 3.8 below guarantees the existence of such a vector,
and bounds its sparsity). This set of vectors forms a basis for the projection of $\mathcal{S}_1$ onto the backbone.



^1 Notice that there may be a few variables (w.h.p. at most a constant number) in the periphery that also are
uniquely determined by the core assignment.
For the second set of vectors, let $\mathbb{H}_P = \mathbb{H}_{F_P,V_P}$ be the matrix corresponding to the periphery
graph. We construct a sparse basis for the kernel of the matrix $\mathbb{H}_P$, following the general procedure
described in Section 3.1. Namely, we first collapse the graph and then peel it to order the nodes.
Note that this second set of basis vectors vanishes on the backbone variables. Lemma 3.4 is used
to bound its sparsity.

The first set of vectors is characterized as below (see Section 8 for a proof).

Lemma 3.8. Consider any $\alpha \in (\alpha_d(k), \alpha_s(k))$. Let $G$ be drawn uniformly from $\mathbb{G}(n,k,m)$. Take
$\varepsilon(\alpha,k) > 0$ from Lemma 3.5 and consider any sequence $(c_n)_{n \ge 1}$ such that $\lim_{n\to\infty} c_n = \infty$. Then,
with high probability, the following is true. For every $x_C \in \mathcal{L}_C(\varepsilon n)$, there exists a $c_n$-sparse $x \in \mathcal{S}_1$
such that $P_C x = x_C$.

3.3 Analysis of the construction 

The main challenge in proving Theorem 2 is bounding the sparsity of the bases constructed (either
for the full set of solutions, when $G$ does not have a core, or for the cluster $\mathcal{S}_1$, when $G$ has a core).
This involves two types of estimates: the first one uses Lemma 3.4 while the second is stated as
Lemma 3.8. In the first estimate, we need to bound all the quantities involved in the sparsity upper
bound: the number of iterations $T$ after which peeling (on the collapsed graph $G_*$) halts, and the
maximum size $\max_{v \in V_*} S(v,T)$ of any ball of radius $T$ in the collapsed graph. In particular we will
show that, w.h.p., we have $T = O(\log\log n)$, and that $\max_{v \in V_*} S(v,T) \le (\log n)^C$ w.h.p., which
gives sparsity $s \le (\log n)^C$.

Proving these bounds turns out to be a relatively simpler task when $G$ does not have a 2-core,
partly because the graph in question has no factor nodes of degree 2, and thus the collapse
procedure is not needed. A second reason is that when $G$ has a 2-core, we need to apply Lemma 3.4
to the periphery subgraph as discussed above. Remarkably, the periphery graph admits a relatively
explicit probabilistic characterization. We say that a graph is peelable if its core is empty, and hence
the peeling procedure halts with the empty graph. It turns out that, conditional on the degree
distribution, the periphery is uniformly random among all peelable graphs.

Such an explicit characterization is not available, however, when we consider the subgraph
obtained by removing the core (the periphery is obtained by removing the entire backbone).
Nevertheless, the proof of Lemma 3.8 requires the study of this more complex subgraph. We overcome
this problem by using tools from the theory of local weak convergence [BS96, AS03, AL07].

Given a graph $G = (F, V, E)$, its check-node degree profile $R = (R_l)_{l \in \mathbb{N}}$ is a probability
distribution such that, for any $l \in \mathbb{N}$, $m R_l$ is the number of check nodes of degree $l$. A degree profile $R$
can conveniently be represented by its generating polynomial $R(x) = \sum_{l \ge 0} R_l x^l$. The derivative of
this polynomial is denoted by $R'(x)$. In particular $R'(1) = \sum_{l \ge 0} l R_l$ is the average degree.

Given integers $m$, $n$, and a probability distribution $R = (R_l)_{l \le k}$ over $\{0, 1, \dots, k\}$, we denote
by $\mathbb{D}(n,R,m)$ the set of check-node-degree-constrained graphs, i.e., the set of bipartite graphs with
$m$ labeled check nodes, $n$ labeled variable nodes and check-node degree profile $R$. As for the model
$\mathbb{G}(n,k,m)$ we will write $G \sim \mathbb{D}(n,R,m)$ to denote a graph drawn uniformly at random from this
set. Note that we have restricted the checks to have degree no more than $k$. Further, we will only
be interested in cases with $R_0 = R_1 = 0$.

Lemma 3.9. Let $G = (F, V, E) \sim \mathbb{G}(n,k,m)$ and let $G_P$ be its periphery. Suppose that with positive
probability, $G_P$ has $n_P$ variable nodes, $m_P$ check nodes, and check degree profile $R^P$. Then,
conditioned on $G_P \in \mathbb{D}(n_P, R^P, m_P)$, the periphery $G_P$ is distributed uniformly over the set
$\mathbb{D}(n_P, R^P, m_P) \cap \mathcal{P}$, where $\mathcal{P}$ is the set of peelable graphs.
There is a small technical issue here in that if $G' \in \mathbb{D}(n_P, R^P, m_P)$, then variable nodes in $G'$
have labels from 1 to $n_P$, whereas $G_P$ has variable node labels that form a subset of $\{1, 2, \dots, n\}$,
and similarly for check nodes. We adopt the convention that the variable and check nodes in
$G_P$ are relabeled sequentially, respecting the original order, before comparing with elements of
$\mathbb{D}(n_P, R^P, m_P)$.

The above lemma establishes that the periphery is roughly uniform, conditional on being
peelable. Its proof is in Section 6.

Conceptually, we will bound the sparsity, as estimated in Lemma 3.4, by proceeding in three
steps: (1) Bound the estimated basis sparsity $\max_v S(v,T)$ for check-node-degree-constrained graphs
$\mathbb{D}(n,R,m)$, in terms of the degree distribution; (2) Estimate the 'typical' degree distribution for
the periphery, and prove concentration around this estimate; (3) Prove that, if $R$ is close to the
typical degree distribution, then $G \sim \mathbb{D}(n,R,m)$ is peelable with uniformly positive probability.
The latter allows us to transfer the sparsity estimates from the uniform model $\mathbb{D}(n,R,m)$ to the actual
distribution of the periphery.

Lemma 3.11 below accomplishes steps (1) and (3), while Lemma 3.12 takes care of step (2). In
order to state these lemmas, it is convenient to introduce density evolution (the terminology comes
from the analysis of sparse graph codes [LMSS98, LMSS01, RU08]).

Definition 3.10. Given $\alpha > 0$, a degree profile $R$, and an initial condition $z_0 \in [0,1]$, we define
the density evolution sequence $\{z_t\}_{t \ge 0}$ by letting for any $t \ge 1$,

$$z_t = 1 - \exp\{-\alpha R'(z_{t-1})\}. \tag{9}$$

Whenever not specified, the initial condition will be assumed to be $z_0 = 1$. The one-dimensional
recursion (9) will also be called density evolution recursion.

We say the pair $(\alpha, R)$ is peelable at rate $\eta$ for $\eta > 0$ if $z_t \le (1-\eta)^t/\eta$ for all $t \ge 0$. We say
that the pair $(\alpha, R)$ is exponentially peelable (for short, peelable) if there exists $\eta > 0$ such that it
is peelable at rate $\eta$.
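The recursion (9) is easy to iterate numerically. The sketch below is an illustration only: the names `density_evolution` and `peelable_up_to` are hypothetical, the degree profile is passed as a list of coefficients `R[l]`, and the peelability check inspects only the finite horizon that was computed, so it cannot certify the condition "for all $t \ge 0$" of Definition 3.10.

```python
import math


def density_evolution(alpha, R, steps, z0=1.0):
    """Iterate z_t = 1 - exp(-alpha * R'(z_{t-1})) and return [z_0, ..., z_steps].

    R is a list of coefficients, R[l] being the fraction of checks of degree l.
    """
    def R_prime(z):
        return sum(l * R[l] * z ** (l - 1) for l in range(1, len(R)))

    zs = [z0]
    for _ in range(steps):
        zs.append(1.0 - math.exp(-alpha * R_prime(zs[-1])))
    return zs


def peelable_up_to(zs, eta):
    """Check z_t <= (1 - eta)^t / eta on the computed horizon only."""
    return all(z <= (1.0 - eta) ** t / eta for t, z in enumerate(zs))


# Regular degree-3 checks, R(x) = x^3, at two illustrative densities.
R3 = [0.0, 0.0, 0.0, 1.0]
low = density_evolution(0.5, R3, steps=20)    # sequence decays to 0
high = density_evolution(0.9, R3, steps=60)   # sequence stalls at a positive value
```

For the lower density the sequence collapses to zero very quickly, while for the higher one it converges to a strictly positive fixed point, so the decay bound fails once $t$ is large enough.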

The density evolution recursion (9) describes the large graph asymptotics of a certain belief
propagation algorithm that captures the peeling process, and will be described in Section 4.
The next lemma is proved in Section [H 

Lemma 3.11. Consider the set $\mathbb{D}(n,R,\alpha n)$, where $R = (R_l)_{l \le k}$ is a check degree profile such
that $R_0 = R_1 = 0$. Assume that the pair $(\alpha, R)$ is peelable at rate $\eta$. Then there exist constants
$N_0 = N_0(\eta,k) < \infty$, $\delta = \delta(\eta,k) > 0$, $C_1 = C_1(\eta,k) < \infty$, $C_2 = C_2(\eta,k) < \infty$ such that the
following hold, for $G$ a random graph drawn from $\mathbb{D}(n,R,m)$ with $n > N_0$:

(i) The graph $G$ is peelable with probability at least $\delta$. Further, if $R_2 = 0$, one can take $\delta$ arbitrarily
close to 1 (in other words $G$ is peelable w.h.p.).

(ii) Conditional on $G$ being peelable, peeling on the collapsed graph $G_*$ terminates after $T \le
C_1 \log\log n$ iterations, with probability at least $1 - n^{-1/2}$.

(iii) Letting $T_{\rm ub} = \lfloor C_1 \log\log n \rfloor$, we have $\max_{v \in V_*} S(v, T_{\rm ub}) \le (\log n)^{C_2}$ with probability at least
$1 - n^{-1/2}$.

Our final lemma is proved in Section 6 and establishes the peelability condition for the periphery.

Lemma 3.12. For any $\alpha > \alpha_d$ there exist constants $\eta = \eta(k,\alpha) > 0$, $\gamma_* = \gamma_*(k,\alpha) > 0$ such that
the following holds. Let $G = (F, V, E)$ be a graph drawn uniformly at random from the ensemble
$\mathbb{G}(n,k,m)$, $m = n\alpha$, and let $G_P = (F_P, V_P, E_P)$ be its periphery. Let $m_P = |F_P|$, $n_P = |V_P|$,
$\alpha_P = m_P/n_P$ and denote by $R^P$ the random check degree profile of $G_P$. Then, for any $\varepsilon > 0$, w.h.p.
we have: (i) The pair $(\alpha_P, R^P)$ is peelable at rate $\eta$; (ii) $n(\gamma_* - \varepsilon) \le n_P \le n(\gamma_* + \varepsilon)$.
3.4 Putting everything together

At this point we can formally summarize the proof of our main result, Theorem 2, that builds on
the construction and analysis provided so far.

Proof (Theorem 2). 1. For $\alpha < \alpha_d(k)$, w.h.p., the graph $G$ does not contain a 2-core (cf. Theorem
3), hence peeling returns an empty graph. Using the construction in Lemma 3.4 we obtain an
$s$-sparse basis, with $s = \max_{v \in V} |B(v, T_C)|$ (notice that in this case there is no factor node of degree 2
and hence the collapsed graph coincides with the original graph). The number of peeling iterations
$T_C$ is bounded by Lemma 3.11 (ii), using the fact that, by definition of $\alpha_d(k)$, the pair $(\alpha, R)$ with
$R_k = 1$ is peelable at rate $\eta = \eta(\alpha,k) > 0$ for $\alpha < \alpha_d(k)$. Hence $T_C \le C_1 \log\log n$ w.h.p., for some
$C_1 = C_1(\alpha,k) < \infty$. Finally, by applying Lemma 3.11 (iii) we obtain the thesis.

Next consider point 2. The partition into clusters is constructed as per Eq. (8), and in particular
the number of clusters $N$ is equal to the number of solutions of the core linear system $\mathbb{H}_C x = 0$
divided by $g$ given by Eq. (7). Let us consider the various claims concerning this partition:

2.(a) By construction, it is sufficient to construct a basis of the cluster $\mathcal{S}_1$ containing the origin,
cf. Section 3.2. The basis has two sets of vectors.

The first set of vectors is given by Lemma 3.8. Their projection onto the core spans the core
solutions in $\mathcal{S}_{C,1}$. Since variables in the backbone are uniquely determined by those on the core,
their projection onto the backbone spans the backbone projection of $\mathcal{S}_1$. By Lemma 3.8 these
vectors are, w.h.p., $c_n$-sparse for any $c_n \to \infty$. Lemma 3.4 provides the second set of vectors.
These span the kernel of the adjacency matrix of the periphery, $\mathbb{H}_P$, and vanish identically in the
backbone. In particular, they are independent from the first set. It is easy to check that the two
sets of vectors together form a basis for the cluster $\mathcal{S}_1$.

We are left with the task of proving that the second set of basis vectors is sparse. The
construction in Lemma 3.4 proceeds by collapsing the periphery graph $G_P$, and applying peeling. We
thus need to bound the sparsity $s = \max_{v \in V_*} S(v, T_C)$. Define the event

$$E_1 = \{(\alpha_P, R^P) \text{ is peelable at rate } \eta > 0 \text{ and } n_P \ge n\gamma_*/2\}.$$

By Lemma 3.12 we know that $E_1$ holds with high probability for suitable choices of $\eta = \eta(k,\alpha) > 0$
and $\gamma_* = \gamma_*(\alpha,k) > 0$. Further $R^P_0 = R^P_1 = 0$ with probability 1.

From Lemma 3.9 we know that $G_P$ is drawn uniformly from the set $\mathbb{D}(n_P, R^P, m_P) \cap \mathcal{P}$. Let
$G'$ be drawn uniformly from $\mathbb{D}(n_P, R^P, m_P)$, with $(n_P, R^P, m_P)$ distributed as for $G_P$, conditional on
$(\alpha_P, R^P) \in E_1$. We can then apply Lemma 3.11 to $G'$. From point (i), it follows that $G'$ is peelable
with probability at least $\delta = \delta(\alpha,k) > 0$. Let $G'_*$ be the result of collapsing $G'$. From points (ii)
and (iii) it follows that, with probability at least $1 - n_P^{-1/2} \ge 1 - (n\gamma_*/2)^{-1/2} \to 1$ as $n \to \infty$, we
have $\max_{v \in V'_*} S_{G'}(v, T_C) \le \max_{v \in V'_*} S_{G'}(v, T_{\rm ub}) \le (\log n)^C$, for some $C = C(\alpha,k) < \infty$. (We use
the subscript on $S$ to indicate the graph under consideration.)

Since $E_1$ holds for $G_P$ w.h.p., and since $G'$ is peelable with probability uniformly bounded away
from zero, it follows that the same bound on the sparsity holds for $G_P$ as well. In other words,
w.h.p., we have that

$$\max_{v \in V_{P,*}} S_{G_P}(v, T_C) \le (\log n)^C.$$

Here, $V_{P,*}$ is the set of super-nodes resulting from the collapse of $G_P$. Finally, using Lemma 3.4
we deduce that the second set of basis vectors obtained from this construction is $s$-sparse for
$s = (\log n)^C$.
2.(b) By Lemma 3.5, w.h.p., for any two core solutions $x_C \in \mathcal{S}_{C,1}$, $x'_C \in \mathcal{S}_{C,b}$, $b \ne 1$, we have
$d(x_C, x'_C) \ge n\varepsilon$. This immediately implies $d(x,x') \ge n\varepsilon$, for any two solutions $x \in \mathcal{S}_1$, $x' \in \mathcal{S} \setminus \mathcal{S}_1$.
By linearity, we conclude $d(\mathcal{S}_a, \mathcal{S}_b) \ge n\varepsilon$ for all $a \ne b$.

2.(c) Let $N_C$ be the number of solutions of the core linear system $\mathbb{H}_C x = 0$. This was proved
to concentrate on the exponential scale in [DM02, DGM+10], with $n(\Sigma - \varepsilon) \le \log N_C \le n(\Sigma + \varepsilon)$
with high probability, and $\Sigma$ given as in the statement (cf. also [MM09]). The number of clusters
is $N = N_C/g$ for $g = 2^{|\mathcal{L}_C(\varepsilon n)|}$, cf. Eq. (7). Using the bound $|\mathcal{L}_C(\varepsilon n)| \le s_n$ from Lemma 3.5 (ii)
and choosing $s_n$ to diverge sufficiently slowly with $n$, we deduce that $N$ also concentrates on the
exponential scale with the same exponent as $N_C$. □

4 A belief propagation algorithm and density evolution 

A useful analysis tool is provided by a belief propagation algorithm that refines the peeling algorithm
introduced in Section 3.1. The same algorithm is also of interest in iterative coding, see [RU08,
MM09].

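Given a factor graph $G = (F, V, E)$, the algorithm updates $2|E|$ messages indexed by directed
edges in $G$. In other words, for each $(v,a) \in E$, $v \in V$ and $a \in F$, and any iteration number $t \in \mathbb{N}$,
we have two messages $\nu^t_{v\to a}$ and $\hat\nu^t_{a\to v}$, taking values in $\{0, *\}$. For $t \ge 1$, messages are computed
following the update rules

$$\nu^{t}_{v\to a} = \begin{cases} * & \text{if } \hat\nu^{t-1}_{b\to v} = * \text{ for all } b \in \partial v \setminus a, \\ 0 & \text{otherwise,} \end{cases} \tag{10}$$

and

$$\hat\nu^{t}_{a\to v} = \begin{cases} 0 & \text{if } \nu^{t}_{u\to a} = 0 \text{ for all } u \in \partial a \setminus v, \\ * & \text{otherwise.} \end{cases} \tag{11}$$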

The initialization at $t = 0$ depends on the context, but it is convenient to single out two special
cases. In the first case, all messages are initialized to 0: $\nu^0_{v\to a} = \hat\nu^0_{a\to v} = 0$ for all $(a,v) \in E$. In the
second, they are all initialized to $*$: $\nu^0_{v\to a} = \hat\nu^0_{a\to v} = *$ for all $(a,v) \in E$. We will refer to these two
cases (respectively) as BP$_0$ and BP$_*$. We let $\nu^t = (\nu^t_{v\to a})_{(a,v)\in E}$ and $\hat\nu^t = (\hat\nu^t_{a\to v})_{(a,v)\in E}$ denote the
vectors of messages.
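The two update rules and the two initializations can be sketched as follows. This is a minimal sketch under assumptions, not the authors' code: the function name `run_bp`, the dictionary representation of the graph, the use of the integer `0` and the string `"*"` as message values, and the toy graph are all illustrative choices. Within one iteration the variable-to-check messages are refreshed from the previous check-to-variable messages, and the check-to-variable messages are then refreshed from the new variable-to-check messages.

```python
def run_bp(checks, init):
    """Iterate the two BP update rules until no message changes.

    checks: dict mapping each check a to a tuple of distinct variables.
    init: 0 for the all-zero start (BP_0), "*" for the all-star start (BP_*).
    Returns (nu, nu_hat, t): messages keyed by the edge (v, a), and the
    number of iterations that changed at least one message.
    """
    edges = [(v, a) for a, nbrs in checks.items() for v in nbrs]
    checks_of = {}
    for v, a in edges:
        checks_of.setdefault(v, []).append(a)
    nu = {e: init for e in edges}       # variable-to-check messages
    nu_hat = {e: init for e in edges}   # check-to-variable messages
    t = 0
    while True:
        # v -> a is "*" when every other check b sends "*" to v.
        new_nu = {
            (v, a): "*" if all(nu_hat[(v, b)] == "*"
                               for b in checks_of[v] if b != a) else 0
            for (v, a) in edges
        }
        # a -> v is 0 when every other variable u sends 0 to a.
        new_nu_hat = {
            (v, a): 0 if all(new_nu[(u, a)] == 0
                             for u in checks[a] if u != v) else "*"
            for (v, a) in edges
        }
        if new_nu == nu and new_nu_hat == nu_hat:
            return nu, nu_hat, t
        nu, nu_hat, t = new_nu, new_nu_hat, t + 1


# Hypothetical toy graph: c0..c3 form a 2-core on variables 0..3.
checks = {
    "c0": (0, 1, 2), "c1": (0, 1, 3), "c2": (0, 2, 3), "c3": (1, 2, 3),
    "c4": (0, 1, 4), "c5": (5, 6, 7), "c6": (3, 4, 5),
}
nu0, nu_hat0, t0 = run_bp(checks, 0)        # BP_0
nu_s, nu_hat_s, t_s = run_bp(checks, "*")   # BP_*
```

On this toy graph the all-star initialization is already a fixed point, while the all-zero initialization lets star messages spread inward from the degree-one variables 6 and 7 until they reach the 2-core.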

We mention here that BP$_*$ on a graph $G \sim \mathbb{G}(n,k,m)$ turns out to be trivial (all messages
remain $*$). However, we find it useful to run BP$_*$ on the subgraph induced by variable and check
nodes outside the core. We describe this in detail in Section 4.2.

The belief propagation algorithm introduced here enjoys an important monotonicity property.
More precisely, define a partial ordering between message vectors by letting $0 \succeq *$ and $\nu \succeq \nu'$ if
$\nu_{v\to a} \succeq \nu'_{v\to a}$ and $\hat\nu_{a\to v} \succeq \hat\nu'_{a\to v}$ for all $(a,v) \in E$.

Lemma 4.1 ([RU08, MM09]). Given two states $\nu_1^t \succeq \nu_2^t$, we have $\nu_1^{t'} \succeq \nu_2^{t'}$ and $\hat\nu_1^{t'} \succeq \hat\nu_2^{t'}$ at all
$t' \ge t$.

As a consequence, the iteration BP$_0$ is monotone decreasing (i.e. $\nu^{t+1} \preceq \nu^t$) and BP$_*$ is
monotone increasing (i.e. $\nu^{t+1} \succeq \nu^t$). In particular, both converge to a fixed point in at most $|E|$
iterations.

It is not hard to check by induction over $t$ that BP$_0$ corresponds closely to the peeling process.

Lemma 4.2. A variable node $v$ is eliminated in round $t$ of peeling, i.e., $v \in V_t$, if there is at most
one incoming 0 message to $v$ in iteration $t-1$ of BP$_0$ but this was not true in previous rounds. A
factor node $a$ is eliminated in round $t$ of peeling (i.e., $a \in F_t$), along with all its incident edges, if
it receives a $*$ message for the first time in iteration $t$ of BP$_0$.

Further, the fixed point of BP$_0$ captures the decomposition of $G$ into core, backbone and
periphery as follows.

Lemma 4.3. Let $(\nu^\infty, \hat\nu^\infty)$ denote the fixed point of BP$_0$. For $v \in V$, we have

• $v \in V_C$ if and only if $v$ receives two or more incoming 0 messages under $\hat\nu^\infty$,

• $v \in V_B \setminus V_C$ if and only if $v$ receives exactly one incoming 0 message under $\hat\nu^\infty$,

• $v \in V_P$ if and only if $v$ receives no incoming 0 messages under $\hat\nu^\infty$.

For $a \in F$, we have

• $a \in F_C$ if and only if $a$ receives no incoming $*$ message under $\nu^\infty$,

• $a \in F_B \setminus F_C$ if and only if $a$ receives one incoming $*$ message under $\nu^\infty$,

• $a \in F_P$ if and only if $a$ receives two or more incoming $*$ messages under $\nu^\infty$.

Finally, $G_C$ is the subgraph induced by $(F_C, V_C)$ and similarly for $G_B$ and $G_P$.
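The classification rule of Lemma 4.3 amounts to counting messages at each node. The sketch below assumes that a fixed point of the all-zero initialization has already been computed and is described by the sets of directed edges carrying a star message; the function name `decompose`, the class labels, and the hand-computed toy fixed point are illustrative assumptions. Variables that appear in no check are not classified.

```python
def decompose(checks, star_v2c, star_c2v):
    """Classify nodes from a fixed point, following the rule of Lemma 4.3.

    checks: dict mapping each check a to the tuple of its variables.
    star_v2c: set of pairs (v, a) whose variable-to-check message is "*".
    star_c2v: set of pairs (a, v) whose check-to-variable message is "*".
    Every message not listed is taken to be 0.
    Returns (var_class, check_class) with values in
    {"core", "backbone-minus-core", "periphery"}.
    """
    var_zeros = {}
    check_class = {}
    for a, nbrs in checks.items():
        # Checks are classified by the number of incoming star messages.
        stars = sum((v, a) in star_v2c for v in nbrs)
        check_class[a] = {0: "core", 1: "backbone-minus-core"}.get(stars, "periphery")
        for v in nbrs:
            var_zeros[v] = var_zeros.get(v, 0) + ((a, v) not in star_c2v)
    # Variables are classified by the number of incoming 0 messages.
    var_class = {
        v: {0: "periphery", 1: "backbone-minus-core"}.get(z, "core")
        for v, z in var_zeros.items()
    }
    return var_class, check_class


# Hypothetical toy graph: checks a, b form a 2-core on variables 0, 1, 2.
checks = {"a": (0, 1, 2), "b": (0, 1, 2), "c": (2, 3, 4), "d": (0, 1, 5)}
# Fixed point worked out by hand for this graph (all other messages are 0).
star_v2c = {(3, "c"), (4, "c"), (5, "d")}
star_c2v = {("c", 2), ("c", 3), ("c", 4), ("d", 0), ("d", 1)}
var_class, check_class = decompose(checks, star_v2c, star_c2v)
```

Here variable 5 is determined by the core through check `d` and so lands in the backbone, while check `c` has two free neighbors and remains in the periphery.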

The proofs of the last two lemmas are based on a straightforward case-by-case analysis, and we
omit them. (In fact, this correspondence is well known in iterative coding, albeit in a somewhat
different language [RU08].)

4.1 Density evolution 

It turns out that the distribution of BP messages is closely tracked by density evolution, in the large
graph limit. Before stating this fact formally, it is useful to introduce a different ensemble $\mathbb{C}(n,R,m)$
that will be used in some of the proofs. A graph $G$ in $\mathbb{C}(n,R,m)$ is constructed as follows. We
label variable nodes 1 through $n$ and check nodes 1 through $m$. We choose an arbitrary partition
of the $m$ check nodes into $k+1$ sets with the $l$-th set consisting of $m R_l$ check nodes with degree $l$
each, for $l = 0, 1, \dots, k$. For each check node of degree $l$ we draw $l$ half-edges distinct from each
other. Each of these half-edges is connected to an arbitrary variable node.

There is a close relationship between the sets $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$. Any element of
$\mathbb{D}(n,R,m)$ corresponds to $\prod_{l=2}^{k} (l!)^{m R_l}$ elements of $\mathbb{C}(n,R,m)$, with the ambiguity arising due
to the ordering of the neighborhood of a check node in $\mathbb{C}(n,R,m)$. Conversely, any element of
$\mathbb{C}(n,R,m)$ with no double edges (two or more edges between the same (variable, check) pair)
corresponds to a unique element of $\mathbb{D}(n,R,m)$. Moreover, the fraction of elements of $\mathbb{C}(n,R,m)$
that have no double edges is uniformly bounded away from zero as $n \to \infty$ [Bol80]. This leads to
Lemma 4.4 below.
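The ordered-neighborhood ensemble and the double-edge event can be explored with a short simulation. This is a sketch under assumptions: the names `sample_C`, `has_double_edge` and `simple_fraction` are hypothetical, variables are labeled from 0 rather than 1, and the check degrees are supplied directly as a list instead of through the profile $R$.

```python
import random


def sample_C(n, check_degrees, rng):
    """Draw one graph with ordered check neighborhoods.

    check_degrees: list with the degree of each of the m check nodes.
    Each half-edge picks a variable node in {0, ..., n-1} uniformly and
    independently, so repeated neighbors (double edges) are possible.
    """
    return [tuple(rng.randrange(n) for _ in range(l)) for l in check_degrees]


def has_double_edge(graph):
    """True if some check node is joined twice to the same variable node."""
    return any(len(set(nbrs)) < len(nbrs) for nbrs in graph)


def simple_fraction(n, check_degrees, trials, seed=0):
    """Monte Carlo estimate of the fraction of samples with no double edge."""
    rng = random.Random(seed)
    good = sum(not has_double_edge(sample_C(n, check_degrees, rng))
               for _ in range(trials))
    return good / trials


# Illustrative parameters: 200 variables, 100 checks of degree 3.
estimate = simple_fraction(200, [3] * 100, trials=300)
```

Rejecting the samples that contain a double edge and forgetting the order inside each neighborhood gives a graph with distinct neighbors per check, which is how the correspondence between the two ensembles is typically used.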



Lemma 4.4. Let $E$ be a graph property that does not depend on edge labels (for example, $E(G) =
\{G \text{ is a tree}\}$). There exists $C = C(k, \alpha_{\max}) < \infty$ such that the following is true for any $\alpha \in
[0, \alpha_{\max}]$. Suppose $E$ holds with probability $1 - \varepsilon$ for $G$ drawn uniformly at random from $\mathbb{C}(n,R,\alpha n)$,
for some $\varepsilon \in [0,1]$. Then $E$ holds with probability at least $1 - C\varepsilon$ for $G'$ drawn uniformly at random
from $\mathbb{D}(n,R,\alpha n)$.
An important tool in the following will be the notion of local (weak) convergence of graph
sequences [BS96]. We make this notion precise here, following [DM10].

Definition 4.5. Let $G_n = (F_n, V_n, E_n)$ be a sequence of factor graphs. Let $\mathbb{P}_n$ denote the law of
the ball $B_{G_n}(v,t)$ when $v \in V_n$ is uniformly random. We say that $\{G_n\}$ converges locally to the
random rooted tree $T$ if, for any finite $t$ and any rooted tree $T_0$ of depth at most $t$,

$$\lim_{n\to\infty} \mathbb{P}_n\{B_{G_n}(v,t) \simeq T_0\} = \mathbb{P}\{T(t) \simeq T_0\} \tag{12}$$

holds almost surely with respect to the graph law. We say that $\{G_n\}$ converges locally on average
to $T$ if Eq. (12) holds in expectation, i.e.

$$\lim_{n\to\infty} \mathbb{E}_{G_n} \mathbb{P}_n\{B_{G_n}(v,t) \simeq T_0\} = \mathbb{P}\{T(t) \simeq T_0\} \tag{13}$$

holds for any finite $t$ and any rooted tree $T_0$ of depth at most $t$.

We now return to the distribution of BP messages and density evolution. 

Lemma 4.6. Let $\{z_t\}$ be the density evolution sequence defined by (9), for a given polynomial
$R$, with $z_0 = 1$, and define $\hat z_t = R'(z_t)/R'(1)$. Assume $G \sim \mathbb{D}(n,R,m)$ or $G \sim \mathbb{C}(n,R,m)$ with
$m = n\alpha$.

Let $R^{(t)}_{l_0,l_*}$ be the fraction of check nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$
messages after $t$ iterations of BP$_0$. Similarly, let $L^{(t)}_{l_0,l_*}$ be the fraction of variable nodes receiving $l_0$
incoming 0 messages and $l_*$ incoming $*$ messages after $t$ iterations of BP$_0$.
Then for any fixed $\delta > 0$, $t \ge 0$ the following occur with high probability:

$$\Big| R^{(t)}_{l_0,l_*} - R_{l_0+l_*} \binom{l_0+l_*}{l_0} z_t^{l_0} (1 - z_t)^{l_*} \Big| < \delta \quad \text{for } l_0, l_* \in \{0, 1, \dots, k\}, \tag{14}$$

$$\Big| L^{(t)}_{l_0,l_*} - \mathbb{P}\{X_0 = l_0, X_* = l_*\} \Big| < \delta \quad \text{for } l_0, l_* \in \mathbb{N} \cup \{0\}, \tag{15}$$

where $X_0 \sim \mathrm{Poisson}(R'(1)\alpha \hat z_t)$, $X_* \sim \mathrm{Poisson}(R'(1)\alpha(1 - \hat z_t))$ are two independent Poisson random
variables.

Proof. Notice that both $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$, $m = n\alpha$, converge locally to unimodular
bipartite trees. More precisely, if rooted at random variable nodes, they converge to Galton-Watson
trees with root offspring distribution $\mathrm{Poisson}(R'(1)\alpha)$ at variable nodes, and equal to the size-biased
version of $R$ at check nodes (the proof is analogous to the one of [DM10]). Further, messages are
local functions of the graph, hence their distribution converges to the one on the limit tree. In
particular, incoming messages on the same node are asymptotically independent because they depend
on distinct subtrees. The message distribution can be computed through a standard tree recursion
(see [RU08, MM09]) that coincides with the density evolution recursion (9). □

Using the correspondence in Lemma 4.2 between BP$_0$ and the peeling algorithm, we can use
density evolution to track the peeling algorithm.

Lemma 4.7. Given a factor graph $H$, let $n_1(H)$ denote the number of variable nodes of degree 1,
and $n_{2+}(H)$ the number of variable nodes of degree 2 or larger in $H$. For $l \in \mathbb{N}$, let $m_l(H)$ be the
number of factor nodes of degree $l$ in $H$.
Consider synchronous peeling for $t \ge 1$ rounds on a graph $G \sim \mathbb{D}(n,R,\alpha n)$ or $G \sim \mathbb{C}(n,R,\alpha n)$,
with $R_0 = R_1 = 0$, and let $J_t$ denote the residual graph after $t$ iterations. Let $\omega \equiv \alpha R'(1)$. Then
for any $\delta > 0$, there exists $N_0 = N_0(\delta,k,t,\alpha)$ such that, for $n \ge N_0$, with probability at least $1 - 1/n^2$



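$$\Big| \frac{m_l(J_t)}{n} - \alpha R_l z_t^l \Big| < \delta \quad \text{for } l \in \{2, 3, \dots, k\}, \tag{16}$$

$$\Big| \frac{n_1(J_t)}{n} - \omega \hat z_t \exp(-\omega \hat z_t) \big(1 - \exp(-\omega(\hat z_{t-1} - \hat z_t))\big) \Big| < \delta, \tag{17}$$

$$\Big| \frac{n_{2+}(J_t)}{n} - 1 + \exp(-\omega \hat z_t)(1 + \omega \hat z_t) \Big| < \delta. \tag{18}$$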



Proof. For the sake of simplicity let us consider $n_1(J_t)$. By Lemma 4.2 a node $v$ has degree 1 in
the residual graph $J_t$ if and only if there is one incoming 0 message to $v$ at time $t$, and there were
two or more incoming 0 messages to $v$ at time $t-1$. By Lemma 4.6 the number of incoming 0
messages to $v$ at time $t$ converges in distribution to $Z_1 \sim \mathrm{Poisson}(\omega \hat z_t)$. Using monotonicity of the
algorithm, and again Lemma 4.6, the number of incident edges such that the message incoming
to $v$ at time $t-1$ is 0 but changes to $*$ at time $t$, converges to $Z_2 \sim \mathrm{Poisson}(\omega(\hat z_{t-1} - \hat z_t))$, and
is asymptotically independent of the number of 0 messages (converging to $Z_1$). Therefore $n_1(J_t)/n$
converges as $n \to \infty$ to

$$\mathbb{P}[Z_1 = 1]\, \mathbb{P}[Z_2 \ge 1] = \omega \hat z_t \exp(-\omega \hat z_t) \big(1 - \exp(-\omega(\hat z_{t-1} - \hat z_t))\big).$$

This establishes that the estimate (17) holds with high probability. In order to obtain the
desired probability bound, one can use a standard concentration of measure argument [RU08,
DP09]. Namely, we first condition on the degrees of the check nodes. Since the unconditional
distributions $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$ are recovered by a random relabeling of the check nodes,
such conditioning is irrelevant. We then regard $n_1(J_t)$ as a function of the independent random
variables $X_1, \dots, X_m$ whereby $X_a$ is the neighborhood of the $a$-th check node. We denote by $E$ the
event that all the balls $B_G(v, 2t)$ of radius $2t$ in $G$ have size smaller than $(\log n)^C$. We have



$$\big| \mathbb{E}\{n_1(J_t) \mid X_1, \dots, X_{a-1}, X_a; E\} - \mathbb{E}\{n_1(J_t) \mid X_1, \dots, X_{a-1}, X'_a; E\} \big| \le (\log n)^C.$$



The desired probability estimate then follows by applying Azuma's inequality (in a form that allows
for exceptional events, see for instance [DP09, Theorem 7.7]) and bounding $\mathbb{P}(E^c)$ (see for instance
Section [OD. D 
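The limiting fractions in Eqs. (16)-(18) are explicit functions of the density evolution sequence and can be evaluated directly. The sketch below is illustrative: the function name `peeling_predictions`, the list encoding of the degree profile and the chosen parameters are assumptions. As a sanity check, the predicted number of edges counted from the check side should agree with the count from the variable side, since both equal $\omega \hat z_t z_t$ per variable node.

```python
import math


def peeling_predictions(alpha, R, t):
    """Evaluate the limits of m_l(J_t)/n, n_1(J_t)/n and n_{2+}(J_t)/n for t >= 1.

    R is a list of coefficients, R[l] being the fraction of checks of degree l.
    """
    def R_prime(z):
        return sum(l * R[l] * z ** (l - 1) for l in range(1, len(R)))

    z = [1.0]
    for _ in range(t):
        z.append(1.0 - math.exp(-alpha * R_prime(z[-1])))
    omega = alpha * R_prime(1.0)
    z_hat = [R_prime(x) / R_prime(1.0) for x in z]
    lam = omega * z_hat[t]                      # mean number of incoming 0 messages
    checks = {l: alpha * R[l] * z[t] ** l for l in range(2, len(R))}
    n1 = lam * math.exp(-lam) * (1.0 - math.exp(-omega * (z_hat[t - 1] - z_hat[t])))
    n2plus = 1.0 - math.exp(-lam) * (1.0 + lam)
    return {"checks": checks, "n1": n1, "n2plus": n2plus,
            "omega": omega, "z_hat": z_hat[t]}


# Illustrative profile: half of the checks have degree 2, half degree 3.
out = peeling_predictions(0.7, [0.0, 0.0, 0.5, 0.5], t=3)
lam = out["omega"] * out["z_hat"]
edges_from_checks = sum(l * frac for l, frac in out["checks"].items())
# Degree-1 variables contribute one edge each; variables of degree >= 2
# contribute the Poisson mean restricted to values >= 2.
edges_from_vars = out["n1"] + lam * (1.0 - math.exp(-lam))
```

The agreement of the two edge counts is an algebraic identity of the formulas, so it is a consistency check on the expressions rather than evidence about any particular random graph.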



4.2 BP fixed points 

For our purposes, it is important to characterize the fixed point of the BP$_0$ algorithm introduced
above. Indeed, the structure of this fixed point is directly related to the decomposition of $G$ into
core, backbone and periphery (cf. Lemma 4.3), which is in turn crucial for our definition of clusters.
Let us start from an easy remark on density evolution.

Lemma 4.8. Let $\{z_t\}_{t \ge 0}$ be the density evolution sequence defined by Eq. (9) with initial condition
$z_0 = 1$. Then $t \mapsto z_t$ is monotone decreasing, and hence has a limit $Q \equiv \lim_{t\to\infty} z_t$ which is given
by

$$Q = \sup\{z \ \text{s.t.}\ z = 1 - \exp\{-\alpha R'(z)\}\}. \tag{19}$$
Proof. Monotonicity follows from the fact that $z \mapsto f(z) \equiv 1 - \exp\{-\alpha R'(z)\}$ is monotone
increasing, and that $z_1 = 1 - \exp\{-\alpha R'(1)\} \le z_0$, whence $z_2 = f(z_1) \le f(z_0) = z_1$, and so on. □

Notice that the definition of Q given in this lemma is consistent with the one in Theorem [H 
that corresponds to the special case of regular, degree-$k$ check nodes, i.e. $R(x) = x^k$. We further
let $\hat Q \equiv R'(Q)/R'(1)$.
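For the regular case these definitions become fully explicit. The short computation below is only a worked instance of Eq. (19) and of the definition of $\hat Q$, under the assumption $R(x) = x^k$:

$$R(x) = x^k \;\Longrightarrow\; R'(z) = k z^{k-1}, \qquad R'(1) = k,$$

$$Q = \sup\{z \ \text{s.t.}\ z = 1 - \exp(-k\alpha z^{k-1})\}, \qquad \hat Q = \frac{R'(Q)}{R'(1)} = Q^{k-1}.$$

In particular the two Poisson parameters $k\alpha \hat Q$ and $k\alpha(1 - \hat Q)$ sum to $k\alpha$, the average variable degree.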

We know that both BP$_0$ and density evolution converge to a fixed point. Since density
evolution tracks BP$_0$ for any bounded number of iterations, it would be tempting to conclude that a
description of the BP$_0$ fixed point is obtained by replacing $z_t$ by $Q$ and $\hat z_t$ by $\hat Q$ in Lemma 4.6. This
is, of course, far from obvious because it requires an inversion of the limits $n \to \infty$ and $t \to \infty$.
Despite this caveat, this substitution is essentially correct.

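Lemma 4.9. Assume $G \sim \mathbb{G}(n,k,m)$ with $m = n\alpha$, and $\alpha \in [0, \alpha_d(k)) \cup (\alpha_d(k), \infty)$.

Let $R^{\infty}_{l_0,l_*}$ be the fraction of check nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$
messages at the fixed point of BP$_0$. Similarly, let $L^{\infty}_{l_0,l_*}$ be the fraction of variable nodes receiving $l_0$
incoming 0 messages and $l_*$ incoming $*$ messages at the fixed point of BP$_0$.
Then for any fixed $\delta > 0$ the following occur with high probability:

$$\Big| R^{\infty}_{l_0,l_*} - \binom{k}{l_0} Q^{l_0} (1 - Q)^{l_*} \Big| < \delta \quad \text{for } l_0 \in \{0, 1, \dots, k\},\ l_* = k - l_0, \tag{20}$$

$$\Big| L^{\infty}_{l_0,l_*} - \mathbb{P}\{X_0 = l_0, X_* = l_*\} \Big| < \delta \quad \text{for } l_0, l_* \in \mathbb{N}, \tag{21}$$

where $X_0 \sim \mathrm{Poisson}(k\alpha \hat Q)$, $X_* \sim \mathrm{Poisson}(k\alpha(1 - \hat Q))$ are two independent Poisson random
variables.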

Given Lemma 4.6 above, Lemma 4.9 says that the messages change very little beyond a large
constant number of iterations. A hint at the fact that Lemma 4.9 is significantly more challenging
than Lemma 4.6 is given by the assumption in the former that $\alpha \ne \alpha_d(k)$. In fact, this turns out
to be a necessary assumption, because it implies an important correlation decay property.

Molloy [Mol05] established the analog of Eq. (21) for $\sum_{l_0 \ge 2,\, l_* \ge 0} L^{\infty}_{l_0,l_*}$, which corresponds to
the relative size of the core. We find that the complete theorem presents new challenges: keeping
track of the backbone turns out to be hard. One hurdle is that the 'estimated backbone' after $t$
iterations of BP$_0$ (i.e. the subset of variable nodes that receive exactly one 0 message) does not
evolve monotonically in $t$. In contrast, the 'estimated core' (i.e. the subset of variable nodes that
receive two or more 0 messages) can only shrink. Another hurdle is that, unlike the periphery
(cf. Section 6), it turns out that the backbone is not uniformly random conditioned on the degree
sequence.

The proof of Lemma 4.9 is quite long and will be presented in Section 4.3. The basic idea is
to run BP starting from the initialization with 0 messages coming from vertices in the core and $*$
messages everywhere else. This corresponds to BP$_*$ on the non-core $G_{NC}$ (i.e., the subgraph induced
by $(F \setminus F_C, V \setminus V_C)$), since messages outside the non-core do not change: messages within the core
and from core variables to non-core checks stay fixed to 0. Messages from non-core checks to core
variables stay fixed to $*$. We refer to this algorithm simply as BP$_*$, with the understanding that
BP$_*$ is actually run on $G_{NC}$.

Denote by $\nu^{*,t}_{v\to a}$ the messages produced with this initialization, and $\nu^{0,t}_{v\to a}$ the messages produced
by BP$_0$. Monotonicity of BP update implies $\nu^{*,t}_{v\to a} \preceq \nu^{0,\infty}_{v\to a} \preceq \nu^{0,t}_{v\to a}$. The proof consists in showing
that the fraction of 0 messages in $\{\nu^{*,t}_{v\to a}\}_{(a,v)\in E}$ is, for large fixed $t$, close to the fraction of 0
messages in $\{\nu^{0,t}_{v\to a}\}_{(a,v)\in E}$. The challenge is that no theorem of the form 4.6 is available for BP$_*$.

Our final lemma is a straightforward consequence of Lemmas 4.6 and 4.9 above.

Lemma 4.10. Consider any $k \ge 3$, any $\alpha \in (0, \alpha_d) \cup (\alpha_d, \alpha_s)$ and any $\delta > 0$. There exists $T < \infty$
such that the following occurs. Let $G = (F, V, E) \sim \mathbb{G}(n,k,\alpha n)$. Then, w.h.p., the fraction of
(check-to-variable or variable-to-check) messages that change after iteration $T$ of BP$_0$ is smaller
than $\delta$.

Proof. Let $N^t(n)$ be the fraction of variable-to-check messages that are equal to 0 after $t$ iterations
on $G \sim \mathbb{G}(n,k,n\alpha)$ (with $t = \infty$ corresponding to the fixed point). Then Eqs. (14) and (20) imply
that (redefining $\delta$)

$$|N^t(n) - z_t| \le \frac{\delta}{3}, \qquad |N^\infty(n) - Q| \le \frac{\delta}{3}.$$

Using Lemma 4.8 there exists $T$ large enough so that, for $t \ge T$, $|z_t - Q| \le \delta/3$. By the triangle
inequality, $|N^t(n) - N^\infty(n)| \le \delta$. The thesis follows since, by monotonicity of BP$_0$, $N^t(n) - N^\infty(n)$
is exactly equal to the fraction of messages that change value from iteration $t$ to the fixed point.
The case of check-to-variable messages is analogous. □

4.3 Proof of Lemma 4.9

Throughout this section, the notion of local convergence adopted is convergence locally on average
(cf. Definition 4.5). For the sake of simplicity, we will sometimes drop the specification 'on average'.
Consider Eq. (21). Since the total number of incoming messages is equal to the vertex degree,
which is Poisson($k\alpha$), it is sufficient to control the distribution of incoming 0 messages. In particular,
we define

$$L^{(t)}_{\ell+} \equiv \sum_{l_0 = \ell}^{\infty} \sum_{l_* = 0}^{\infty} L^{(t)}_{l_0,l_*},$$

that is the fraction of nodes that receive $\ell$ or more incoming 0 messages.
We prove a series of lemmas, leading to the desired estimate for $L^{\infty}_{\ell+}$.
An upper bound on $L^{\infty}_{\ell+}$ is relatively easy to obtain.

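Lemma 4.11. Fix $\ell \ge 0$. Then, with high probability, we have

$$L^{\infty}_{\ell+} \le \mathbb{P}\{\mathrm{Poisson}(k\alpha \hat Q) \ge \ell\} + \delta.$$

Proof. From Lemma 4.1 it follows that $L^{(t)}_{\ell+}$ is monotone decreasing in $t$. Using Lemma 4.6 we have,
w.h.p., $L^{\infty}_{\ell+} \le L^{(t)}_{\ell+} \le \mathbb{P}\{\mathrm{Poisson}(k\alpha \hat z_t) \ge \ell\} + \delta/2$. But Lemma 4.8 implies, for $t$ large enough,
$\mathbb{P}\{\mathrm{Poisson}(k\alpha \hat z_t) \ge \ell\} \le \mathbb{P}\{\mathrm{Poisson}(k\alpha \hat Q) \ge \ell\} + \delta/2$, which implies our claim. □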

The lower bound on $L^{\infty}_{\ell}$ cannot be obtained by the same approach. We therefore go through
a detour.

Let $H_*(n) = H_*(n,k,\alpha n)$ be the random rooted factor graph with marks (called a 'network' in
[AL07]), constructed as follows: Draw a graph uniformly at random from $\mathbb{G}(n,k,\alpha n)$. Choose a
uniformly random variable node $i \in V$ as root. Mark variable nodes with mark c if they are in the
2-core.

Lemma 4.12. The sequence $\{H_*(n)\}_{n \ge 0}$ converges locally on average to the random rooted tree
with marks, $\mathcal{T}_*(\alpha,k)$, defined as follows. Construct a random bipartite Galton-Watson tree rooted at
ø with offspring distribution Poisson($k\alpha$) at variable nodes and deterministic $k-1$ at factor nodes.
Let $V_c(\mathcal{T}_*)$ be the maximal subset of its vertices such that each variable node has degree at least 2
and each factor node has degree $k$ in the induced subgraph. Mark with c all vertices in $V_c(\mathcal{T}_*)$.

Proof. It is immediate to see that the sequence $\{H_*(n)\}_{n \ge 0}$ is tight, i.e. that for any $\varepsilon > 0$ there
exists a compact set $\mathcal{K}$ such that $\mathbb{P}\{H_*(n) \in \mathcal{K}\} \ge 1 - \varepsilon$. (For instance take $\mathcal{K}$ to be the set of graphs
that have maximum degree $\Delta_t$ at distance $t$ for a suitable sequence $t \mapsto \Delta_t$.) Therefore [AL07],
any subsequence of $\{H_*(n)\}$ admits a further subsequence that converges locally weakly to a limit,
which is itself a random rooted network. This subsequence can be constructed through a diagonal
argument: First construct a subsequence $\{H_*(n^{t}_s)\}_{s \ge 0}$ such that the depth-$t$ subtree converges.
Refine it to get a subsequence $\{H_*(n^{t+1}_s)\}_{s \ge 0}$ such that the depth-$(t+1)$ subtree converges and so
on. Finally extract the diagonal subsequence $\{H_*(n^{s}_s)\}_{s \ge 0}$.

We will prove the thesis by a standard weak convergence argument [Kal02]: We will show that
for any subsequence of $\{H_*(n)\}_{n \ge 0}$, there is a subsubsequence that converges locally weakly to
$\mathcal{T}_*(\alpha,k)$.

Consider indeed any subsubsequence that converges locally weakly to a limiting random rooted
graph with marks, which we denote by $O_*$. Define the unmarking operator $U$ that maps a marked
rooted graph to the corresponding unmarked rooted graph. We have that $U(O_*) \stackrel{d}{=} U(\mathcal{T}_*)$ (here
$\stackrel{d}{=}$ denotes equality in distribution) from local weak convergence of random graphs to Galton-Watson
trees (see, e.g. [AS03, DM10]). We will hereafter couple the two trees in such a way that

$$U(O_*) = U(\mathcal{T}_*).$$

Recall that a stopping set is any subset of variable nodes of a factor graph, such that each variable
node has degree at least 2 in the induced subgraph. The 2-core of the factor graph is the maximal
stopping set and is a superset of any stopping set. These notions are well defined for infinite graphs
as well.

Now, the marks in $\mathcal{T}_*$ correspond to the core by definition. The marks in $O_*$ form a stopping
set, since $O_*$ is the local weak limit of $H_*(n)$, and in $H_*(n)$, a vertex is marked only if at least
two of its neighboring checks have all marked neighboring variable nodes. Moreover, one can show
that both $\mathcal{T}_*$ and $O_*$ are unimodular. Indeed $\mathcal{T}_*$ is unimodular since the unmarked tree is clearly
unimodular, and the marking process does not make any reference to the root. Unimodularity of
$O_*$ is clear since it is the local weak limit of a marked random graph [AL07]. Thus, in order to
prove our thesis it suffices to show that the density of marks is the same in $\mathcal{T}_*$ and $O_*$. (Because
the subset of nodes that is marked in $\mathcal{T}_*$ contains the subset marked in $O_*$ and the density of their
difference is equal to the difference of the densities. Finally, for a unimodular network, if a mark type
has density 0, then the set of marked nodes is empty by union bounds.)

Let the set of marked vertices in $O_*$ be denoted by $V_c(O_*)$. The density of marks in $O_*$ is given
by

$$\mathbb{P}\{ø \in V_c(O_*)\} = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 2\}, \tag{22}$$

where $Q$ and $\hat{Q}$ are defined as at the beginning of Section 4.2. This was proved in [Mol05], which
shows that indeed the right hand side is the asymptotic normalized size of the 2-core in $G$.
Proceeding analogously to the proof of [BPP06, Proposition 1.2], we obtain

$$\mathbb{P}\{ø \in V_c(\mathcal{T}_*)\} = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 2\}. \tag{23}$$

The sketch of this step is the following. Let $E_t$ be the event that ø belongs to a 'depth $t$ core',
where the requirement of "degree at least 2 in the subgraph" applies only to variables up to depth
$t - 1$. The probability on the left hand side is just $\mathbb{P}\{E\}$ for $E = \cap_{t \ge 1} E_t$. Since $E_t$ is a decreasing
sequence, $\mathbb{P}\{E\} = \lim_{t \to \infty} \mathbb{P}\{E_t\}$. On the other hand $\mathbb{P}\{E_t\}$ can be computed explicitly through a
tree calculation and converges to $\mathbb{P}\{\mathrm{Poisson}(\alpha k\hat{Q}) \ge 2\}$ as $t \to \infty$, yielding (23).

Finally, the thesis follows by comparing Eqs. (22) and (23). □



We next construct a random tree $\mathbb{T}_*(\alpha,k)$ with marks on the directed edges as follows. Marks
take values in $\{0, *\}$ and to each undirected edge we associate a mark for each of the two directions.
We will refer to the direction towards the root as the 'upwards' direction, and to the opposite
one as the 'downwards' direction. The marks correspond to fixed point BP messages, and
we will call them messages as well in what follows. First consider only edges directed upwards.
This is a multi-type GW tree. At the root generate Poisson($k\alpha$) offspring, and mark each of
the edges 0 independently with probability $\hat{Q}$, and $*$ otherwise. At a non-root variable
node, if the parent edge is marked 0, generate Poisson($k\alpha(1-\hat{Q})$) descendant edges marked $*$ and
Poisson$_{\ge 1}$($k\alpha\hat{Q}$) descendant edges marked 0 (here Poisson$_E$($\lambda$) denotes a Poisson random variable
with parameter $\lambda$ conditioned on $E$). If the parent edge is marked $*$, generate Poisson($k\alpha(1-\hat{Q})$)
descendant edges marked $*$ and no descendant edges marked 0. At a factor node, if the parent edge
is marked 0, generate $k-1$ descendant edges marked 0. If the parent edge is marked $*$, generate
$M \sim \mathrm{Binom}_{\le k-2}(k-1, Q)$ descendants marked 0, and $k-1-M$ descendants marked $*$.

For edges directed downwards, marks are generated recursively following the usual BP rules, cf.
Eqs. (10), (11), starting from the top to the bottom. It is easy to check that, with this construction,
the marks in $\mathbb{T}_*(\alpha,k)$ correspond to a BP fixed point.

We extend the unmarking operator $U$ by allowing it to act on graphs with marks on edges (and
removing the marks).

Lemma 4.13. $U(\mathbb{T}_*)$ and $U(\mathcal{T}_*)$ have the same distribution.

Proof. For this we construct $U(\mathbb{T}_*)$ (which is $\mathbb{T}_*$ without the marks revealed) in a 'breadth first'
manner as follows: First we draw a Poisson($\alpha k$) number of factor descendants for the root node.
Let $a$ be a factor descendant of the root. Then $a$ has $k-1$ variable node descendants. The message
$\hat{\nu}_{a \to ø}$ is 0 with probability $\hat{Q}$. It is immediate to check from our construction and $\hat{Q} = Q^{k-1}$ that:

Fact 1: Conditional on the degree of the root $\deg(ø) = d_1$, the $d_1(k-1)$ upwards messages
incoming to the check nodes $a \in \partial ø$ are independent, with $\mathbb{P}\{\nu_{v \to a} = 0\} = Q$.

Now, we draw the number of descendants for each neighbor of $a$. Using Fact 1, together with
the definition of $\mathbb{T}_*$, one can check that:

Fact 2: Conditional on the degree of the root $\deg(ø) = d_1$, the number of descendants of each of
the $d_1(k-1)$ variable nodes $v$ at the first generation is an independent Poisson($k\alpha$) random variable.
Further, the upwards messages towards these variable nodes are independent with $\mathbb{P}\{\hat{\nu}_{b \to v} = 0\} = \hat{Q}$.

This argument (outlined for simplicity for the first generation) can be repeated almost verbatim
at any generation. Denote by $\mathbb{T}_{*,t}$ the first $t$ generations of $\mathbb{T}_*$ (with variable nodes at the leaves).
One then proves by induction that at any $t$, conditional on $U(\mathbb{T}_{*,t})$, the numbers of descendants of
the variable nodes in the last generation are i.i.d. Poisson($k\alpha$), and given these, the corresponding
upwards messages are i.i.d. with $\mathbb{P}\{\hat{\nu}_{b \to v} = 0\} = \hat{Q}$. This implies the thesis. □
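
The key point behind Fact 2 can be checked numerically: averaging over the mark of the parent edge, the number of descendant edges marked 0 at a non-root variable node is again Poisson($k\alpha\hat{Q}$). The sketch below is a minimal illustration, assuming that $Q$ solves $Q = 1 - \exp(-k\alpha Q^{k-1})$ with $\hat{Q} = Q^{k-1}$ (the fixed point equations are not restated in this section); all function names are hypothetical.

```python
import math


def fixed_point(k, alpha, iters=5000):
    """Largest solution of the ASSUMED equation Q = 1 - exp(-k*alpha*Q**(k-1)),
    obtained by iterating from Q = 1."""
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    return q


def poisson_pmf(lam, j):
    """P{Poisson(lam) = j}."""
    return math.exp(-lam) * lam ** j / math.factorial(j)


def zero_children_pmf(k, alpha, j):
    """P{a non-root variable node has j descendant edges marked 0},
    averaged over the mark of its parent edge (0 with probability Q)."""
    q = fixed_point(k, alpha)
    lam = k * alpha * q ** (k - 1)  # k * alpha * Q_hat
    if j == 0:
        # Parent edge marked *: no descendant edge is marked 0.
        return 1.0 - q
    # Parent edge marked 0: Poisson(lam) conditioned on being >= 1.
    return q * poisson_pmf(lam, j) / (1.0 - math.exp(-lam))
```

Because $1 - \exp(-k\alpha\hat{Q}) = Q$ at the fixed point, `zero_children_pmf(k, alpha, j)` agrees with `poisson_pmf(k * alpha * Q_hat, j)` up to rounding, and adding the independent Poisson($k\alpha(1-\hat{Q})$) edges marked $*$ gives a Poisson($k\alpha$) total.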

Lemma 4.14. $\mathbb{T}_*$ is unimodular.

Proof. We already established unimodularity of $U(\mathbb{T}_*)$ (since $U(\mathbb{T}_*) \stackrel{d}{=} U(\mathcal{T}_*)$ is a unimodular
Galton-Watson tree). To establish the claim, let $\mathbb{T}_{**}$ be the random tree whose distribution has
Radon-Nikodym derivative $\deg(ø)/\mathbb{E}\{\deg(ø)\}$ with respect to that of $\mathbb{T}_*$. We need to show that moving
the root to a uniformly random descendant variable node of the root (via one check) in $\mathbb{T}_{**}$ leaves
the distribution of $\mathbb{T}_{**}$ unchanged (cf. [AL07, Section 4]).

Draw $\mathbb{T}_{**}$ at random, weighted by the degree of the root ø. In this argument, we make the root
explicit by denoting the tree by $(\mathbb{T}_{**}, ø)$. Reveal the degree $d_1 = \deg(ø)$ of the root. We have $d_1 > 0$
with probability 1. Take a uniformly random neighboring check $a \in \partial ø$, and a uniformly random
descendant $i$ of $a$ (we know that $a$ has $k-1$ descendants). Reveal the number of descendants of
$i$. Let this number be $d_2 - 1$, so that $i$ has $d_2$ neighbors in total. Note that we do not reveal any
of the messages in $\mathbb{T}_{**}$. At this point, consider the incoming messages to the variable nodes ø and
$i$ except for $\hat{\nu}_{a \to ø}$ and $\hat{\nu}_{a \to i}$, and the incoming messages to the check $a$ except for $\nu_{ø \to a}$ and $\nu_{i \to a}$.
Call this vector of messages $M$. The messages in $M$ are independent, with probability $\hat{Q}$ for
each incoming message to variable nodes to be 0, and probability $Q$ for incoming messages to $a$ to
be 0. The messages $\hat{\nu}_{a \to ø}$, $\hat{\nu}_{a \to i}$, $\nu_{ø \to a}$ and $\nu_{i \to a}$ are deterministic functions of $M$. Finally, notice
that $d_1$ and $d_2$ are independent, and identically distributed as $1 + \mathrm{Poisson}(\alpha k)$. At this point, it is
clear that $(\mathbb{T}_{**}, i)$ is distributed identically to $(\mathbb{T}_{**}, ø)$, which establishes unimodularity. □

Lemma 4.15. Let $F$ be a map from 'trees with marked edges' to 'trees with marked variable nodes'
defined as follows: $F(T)$ is obtained from $T$ by putting a c mark on vertex $i$ if and only if at least
two incoming edges have a 0 mark.

Then $F(\mathbb{T}_*(\alpha,k)) = \mathcal{T}_*(\alpha,k)$.

Proof. It is easy to check that the subset of variable nodes in $\mathbb{T}_*$ that receive two or more incoming
0's forms a stopping set (since the set of messages is at a BP fixed point). But the density of
marked nodes in $F(\mathbb{T}_*)$ (i.e., the probability of the root being marked) is $\mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 2\}$,
which is exactly the same as the density of marked nodes in $\mathcal{T}_*$. (Recall that $\mathcal{T}_*$ is also unimodular, cf.
proof of Lemma 4.12.) On the other hand, the set of marked nodes in $\mathcal{T}_*$ is the core by definition
and hence includes the marked nodes in $F(\mathbb{T}_*)$. We deduce that the set of vertices that are marked
in $\mathcal{T}_*$ but not in $F(\mathbb{T}_*)$ has vanishing density and therefore $F(\mathbb{T}_*(\alpha,k)) = \mathcal{T}_*(\alpha,k)$. □

We let $B$ be the subset of variable nodes $v$ of $\mathbb{T}_*(\alpha,k)$ such that at least one message incoming
to $v$ is equal to 0. Then this set has density

$$\mathbb{P}\{ø \in B\} = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 1\} = Q. \tag{24}$$

In light of Lemma 4.15, we further denote the set of variable nodes in $\mathbb{T}_*$ having two or more
incoming 0 messages by $V_c(\mathbb{T}_*)$.

The following is immediate from the construction of $\mathbb{T}_*$.

Remark 4.16. We have $ø \in B$ if and only if there exists a subtree of $\mathbb{T}_*$ rooted at ø with the
following properties: (i) If $j$ is a variable node in the subtree, either $j \in V_c(\mathbb{T}_*)$ or at least one
descendant factor node is in the subtree; (ii) If $a$ is a factor node in the subtree, all its descendants
are also in the subtree.

We call the subtree just defined a witness for ø (there might be more than one in principle).
We will focus on minimal witnesses, which, in particular, do not contain descendants of nodes in
$V_c(\mathbb{T}_*)$. It is not hard to prove that the minimal witness is unique. Notice that a priori a witness
can be finite (if it ends up with nodes in $V_c(\mathbb{T}_*)$), or infinite.



¹The argument establishing this is essentially the one above, where we showed that $U(\mathbb{T}_*) \stackrel{d}{=} U(\mathcal{T}_*)$.

Lemma 4.17. Almost surely any node $i \in B$ has a finite witness.

Proof. It is sufficient to prove that the following event has zero probability: $ø \in B$ and ø only has
infinite witnesses. Suppose $ø \in B$. If $ø \in V_c(\mathbb{T}_*)$, then it is itself a witness and we are done. If not
then there is exactly one incoming 0 message, say from factor $a$. Then factor $a$ has $k-1$ incoming
0 messages from descendants. The subtrees corresponding to these descendants are independent.
Consider a descendant $i$ of $a$. We have

$$\mathbb{P}(i \in B \setminus V_c(\mathbb{T}_*)) = \mathbb{P}\{\mathrm{Poisson}_{\ge 1}(\alpha k\hat{Q}) = 1\} = \frac{\exp(-\alpha k\hat{Q})\,\alpha k\hat{Q}}{1 - \exp(-\alpha k\hat{Q})} = \exp(-\alpha k\hat{Q})\,\alpha k\,Q^{k-2}.$$

Conditioned on $i \in B \setminus V_c(\mathbb{T}_*)$, the node $i$ has exactly $k-1$ descendant variable nodes (via one check
node). Thus, conditioned on $ø \in B$, the minimal witness is a Galton-Watson tree with offspring
distributed as $Z$, whereby $Z = k-1$ with probability $\exp(-\alpha k\hat{Q})\,\alpha k\,Q^{k-2}$, and $Z = 0$ otherwise.
The branching factor of this tree is $\exp(-\alpha k\hat{Q})\,\alpha k(k-1)\,Q^{k-2} < 1$ (cf. Lemma 6.6 below). The
lemma follows. □
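
The branching factor of the minimal witness can be evaluated numerically. The sketch below is illustrative only: it assumes that $Q$ is the largest solution of $Q = 1 - \exp(-k\alpha Q^{k-1})$ and that $\hat{Q} = Q^{k-1}$ (neither is restated in this section), and the function name is hypothetical.

```python
import math


def witness_branching_factor(k, alpha, iters=20000):
    """Return (Q, p_single, theta) where
    p_single = P{Poisson_{>=1}(alpha*k*Q_hat) = 1} and theta = (k-1) * p_single
    is the branching factor of the minimal witness tree.
    Q is found by iterating the ASSUMED fixed point map from Q = 1."""
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    if q < 1e-9:
        # No non-trivial fixed point: the construction does not apply.
        raise ValueError("Q = 0 for these parameters (no core)")
    lam = alpha * k * q ** (k - 1)  # alpha * k * Q_hat
    p_single = math.exp(-lam) * lam / (1.0 - math.exp(-lam))
    return q, p_single, (k - 1) * p_single
```

For $k = 3$ and $\alpha = 0.85$ this gives a branching factor of roughly 0.75, so the witness tree is subcritical and hence almost surely finite; the value approaches 1 as $\alpha$ decreases towards the clustering threshold.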



Lemma 4.18. Consider the setting of Lemma 4.9. Denoting by $\mathbb{E}$ expectation with respect to
$G \sim \mathbb{G}(n,k,n\alpha)$, we have

$$\liminf_{n \to \infty} \mathbb{E} L^{\infty}_{1} \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 1\}.$$

Proof. Consider running $\mathrm{BP}_*$ on $\mathbb{T}_*$ (recall that this is BP with the initialization 0 coming from
vertices in $V_c(\mathbb{T}_*)$ and $*$ otherwise). Let $B_t$ be the subset of variable nodes that receive at least
one 0 message after $t$ iterations. Let $y_t$ be the density of nodes in $B_t$. From Lemma 4.17 we have
immediately

$$\lim_{t \to \infty} y_t = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 1\}. \tag{25}$$

For $G = (F,V,E) \sim \mathbb{G}(n,k,\alpha n)$, let $B_t(n) \subseteq V$ be the subset of nodes having at least one
incoming 0 after $t$ iterations of $\mathrm{BP}_*$. Let $y_t(n)$ be the fraction of these nodes, i.e. $y_t(n) = |B_t(n)|/n$.
From Lemma 4.12 we have

$$\lim_{n \to \infty} \mathbb{E}\, y_t(n) = y_t, \tag{26}$$

where expectation is taken with respect to $G \sim \mathbb{G}(n,k,\alpha n)$. By Eq. (25), we have $\lim_{n \to \infty} \mathbb{E}\, y_t(n) \ge
\mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 1\} - \delta$ for all $t \ge T(\delta)$. By monotonicity of $\mathrm{BP}_*$, we have $\liminf_{n \to \infty} \mathbb{E} L^{\infty}_{1} \ge
\lim_{n \to \infty} \mathbb{E}\, y_t(n) \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge 1\} - \delta$, which implies the thesis. □



Lemma 4.19. Consider the setting of Lemma 4.9. Denoting by $\mathbb{E}$ expectation with respect to
$G \sim \mathbb{G}(n,k,n\alpha)$, we have, for all $\ell \ge 2$,

$$\liminf_{n \to \infty} \mathbb{E} L^{\infty}_{\ell} \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge \ell\}.$$

Proof. The proof is very similar to that of the previous lemma. For $G = (F,V,E) \sim \mathbb{G}(n,k,\alpha n)$,
let $C(\ell; n) \subseteq V$ be the subset of variable nodes that are in the core and have at least $\ell$ neighboring
check nodes in the core. Then we have (by monotonicity of $\mathrm{BP}_*$)

$$L^{\infty}_{\ell} \ge \frac{|C(\ell; n)|}{n}. \tag{27}$$

On the other hand, let $y(\ell)$ be the density of variable nodes in $\mathbb{T}_*$ that receive two or more
0 messages and have at least $\ell$ neighboring check nodes in the set

$$\{a : \text{for each } i \in \partial a, \text{ node } i \text{ receives two or more 0 messages}\}.$$

It follows from Lemma 4.12 and Lemma 4.15 that

$$\liminf_{n \to \infty} \frac{1}{n}\,\mathbb{E}\{|C(\ell; n)|\} = y(\ell). \tag{28}$$

On the other hand, it is easy to check that the construction of $\mathbb{T}_*$ implies that $y(\ell)$ coincides
with the density of nodes receiving $\ell$ or more 0 messages (here the assumption $\ell \ge 2$ is crucial).
Hence $y(\ell) = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge \ell\}$, which, together with Eqs. (27), (28), yields the thesis. □

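Proof of Lemma 4.9. We will limit ourselves to proving Eq. (21), since (20) follows from a
completely analogous argument. In view of Lemma 4.11 and the discussion above, it is sufficient to prove
that $L^{\infty}_{\ell} \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge \ell\} - \delta$ with high probability. Let $P_\ell = \mathbb{P}\{\mathrm{Poisson}(k\alpha\hat{Q}) \ge \ell\}$
and define $Z = P_\ell - L^{\infty}_{\ell} + \varepsilon$ for some $\varepsilon > 0$. By Lemmas 4.18 and 4.19 we have $\mathbb{E}\{Z\} =
\mathbb{E}\{Z_+\} - \mathbb{E}\{Z_-\} \le 2\varepsilon$ for all $n$ large enough (here $x_+ = \max(x,0)$, $x_- = \max(-x,0)$). Using
$|Z| \le 2$ and taking $n$ sufficiently large, we get by Lemma 4.11

$$\mathbb{E}\{Z_+\} \le 2\varepsilon + \mathbb{E}\{Z_-\} \le 2\varepsilon + 2\,\mathbb{P}\{Z < 0\} \le 3\varepsilon. \tag{29}$$

Hence, by Markov inequality, for all $n \ge N_0(\varepsilon)$

$$\mathbb{P}\{L^{\infty}_{\ell} \le P_\ell - \delta\} = \mathbb{P}\{Z \ge \delta + \varepsilon\} \le \frac{\mathbb{E}\{Z_+\}}{\delta + \varepsilon} \le \frac{3\varepsilon}{\delta}. \tag{30}$$

Since $\varepsilon$ is arbitrary, it follows that $\mathbb{P}\{L^{\infty}_{\ell} \le P_\ell - \delta\} \to 0$, which proves the claim. □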

5 Proof of Lemma 3.11: Peelability implies a sparse basis

5.1 Proof of Lemma 3.11 (i) and (ii)

Let us begin by describing the proof strategy. 

Instead of analyzing peeling on the collapsed graph $G_*$, we analyze a different peeling process.
We first run synchronous peeling on $G$ for a large constant number $\tau$ of iterations. We then collapse
the resulting graph, as discussed in Section 3, i.e. coalescing variables connected to each other
via degree 2 factors (cf. Definition 3.3). Finally, we run synchronous peeling on the collapsed
graph until it gets annihilated. We show that this process takes at least as many iterations as
synchronous peeling on $G_*$ (Lemma 5.1 below). In order to bound the number of iterations under
this new two-stage process, we proceed as follows. We choose the constant $\tau$ such that the residual
graph $J_\tau$ is subcritical and hence consists of trees and unicyclic components of size $O(\log n)$ w.h.p.
As a consequence, the collapsed graph, to be denoted by $\mathsf{T}(J_\tau)$, contains only checks of degree
3 or more, and consists of trees and unicyclic components of size $O(\log n)$. It is not hard to show
that it takes only $O(\log\log n)$ additional rounds of peeling to annihilate $\mathsf{T}(J_\tau)$ under this condition
(see Lemma 5.4 below).

Several technical lemmas follow, which are proved in Appendix B, except Lemma 5.1, which we
prove below. At the end of the subsection, we provide a proof of Lemma 3.11, parts (i) and (ii).

Consider the peeling algorithm and define $\mathsf{J}$ to be the peeling operator corresponding to one
round of synchronous peeling (cf. Table 1). Thus, for a bipartite graph $G$, the residual graph
after $t$ rounds of peeling is $\mathsf{J}^t(G)$. Denote by $\mathsf{J}^\infty(G)$ the graph produced by the peeling procedure
after it halts: this is the empty graph if $G$ is peelable, and the core of $G$ otherwise. Recall that
$T_{\rm C}(G)$ denotes the number of rounds of peeling performed by $\mathsf{J}^\infty(G)$ before halting. Further, define
$\mathsf{T}$ to be the collapse operator as per Definition 3.3. For instance $G_* = \mathsf{T}(G)$. The next lemma
bounds from above the number of rounds of peeling required to annihilate $G_*$, in terms of the
modified peeling process (consisting of $\tau$ rounds of peeling, followed by collapse, and then peeling
until annihilation).

Lemma 5.1. For any constant $\tau \ge 0$ and any peelable bipartite graph $G$,

$$T_{\rm C}(\mathsf{T}(G)) \le T_{\rm C}(\mathsf{T}(\mathsf{J}^\tau(G))) + \tau.$$

Proof. First, note that for bipartite graphs $G_1 \subseteq G_2$ and for any $t \ge 0$, $\mathsf{J}^t(G_1) \subseteq \mathsf{J}^t(G_2)$, i.e.,
the peeling algorithm preserves the partial ordering on graphs defined by $G_1 \preceq G_2$ if and only if
$G_1 \subseteq G_2$. Therefore, to prove Lemma 5.1 it is enough to show that for any graph $G$ and $t \ge 0$,
$\mathsf{J}^t(\mathsf{T}(G)) \subseteq \mathsf{T}(\mathsf{J}^t(G))$.

Consider a bipartite graph $G = (F,V,E)$ and its corresponding collapsed graph $\mathsf{T}(G) =
(F_*, V_*, E_*)$. For $i \in V$, let $t(i)$ be the iteration in which the variable node $i \in V$ is peeled in
$G$. Similarly, let $t_*(i)$ be the iteration in which the supernode $i_* \in V_*$ (with $i \in i_*$) is peeled
in $\mathsf{T}(G)$ (if $i \in G_{\rm C}$, then we take $t(i) = \infty$ and similarly for $t_*(i)$). Define the inverse mapping
$\mathsf{T}^{-1} : V \to V_*$ as $j \in \mathsf{T}^{-1}(i)$ if and only if $j \in V$ and $i$ and $j$ are connected in $G$ through a path
containing only check nodes of degree 2, i.e., $i$ and $j$ belong to the same supernode in $\mathsf{T}(G)$. Define
$\partial i(G)$ to be the neighborhood of node $i$ in graph $G$. We prove the lemma by showing that

$$t_*(i) \le \max_{j \in \mathsf{T}^{-1}(i)} t(j) \qquad \forall\, i \in V. \tag{31}$$

In words, Eq. (31) states that $\mathsf{J}$ peels a supernode in $\mathsf{T}(G)$ in at most as many steps as $\mathsf{J}$ peels all the
corresponding variable nodes, $\mathsf{T}^{-1}(i)$, in $G$. We prove Eq. (31) by induction over $\max_{j \in \mathsf{T}^{-1}(i)} t(j)$.

• Induction base: For $\max_{j \in \mathsf{T}^{-1}(i)} t(j) = 1$, all variable nodes in the given supernode are leaves,
and thus $|\mathsf{T}^{-1}(i)| = 1$ or 2, as otherwise there must exist a variable node of degree 2 or more in
the supernode, which would then not be peeled in the first round in $G$. If $|\mathsf{T}^{-1}(i)| = 1$, then
$\mathsf{T}^{-1}(i) = \{i\}$ and $t_*(i) = t(i) = 1$. If $|\mathsf{T}^{-1}(i)| = 2$ then both variable nodes are connected to
each other through a common degree 2 check node and thus disconnected from the remainder
of nodes in $G$. Thus, in $\mathsf{T}(G)$, the corresponding supernode is also disconnected from the rest
of $\mathsf{T}(G)$ and thus peeled in the first round, i.e. $t_*(i) = 1$.

• Inductive step: Assume that for all $v \in V$ such that $\max_{j \in \mathsf{T}^{-1}(v)} t(j) \le t$, we have $t_*(v) \le t$,
and suppose there exists $i \in V$ such that $\max_{j \in \mathsf{T}^{-1}(i)} t(j) = t+1$, but $t_*(i) > t+1$. We note
that when performing peeling on $G$, without loss of generality, we can always assume that
for every supernode $i_* \in V_*$, exactly one element $i \in \mathsf{T}^{-1}(i)$ will be peeled last. Consider the
alternative, where there exist $i_1, i_2 \in i_*$, $i_1 \ne i_2$, such that $t(i_1) = t(i_2) = t+1$. Since $i_1$ and
$i_2$ are connected by a sequence of degree 2 check nodes, they must be neighbors of a common
degree 2 check node in $G$, as there is no way to peel that check node before peeling either
$i_1$ or $i_2$. Thus, immediately prior to being peeled, as they both must have degree 1, $i_1$ and
$i_2$ are disconnected from the rest of $G$, and their peeling has no effect on the remainder of
the peeling process for $\mathsf{J}^t(G)$. Thus, we can assume $L = \{l \in V : l \in \arg\max_{j \in \mathsf{T}^{-1}(i)} t(j)\}$ is
comprised of a single element, $\ell$, with $t(\ell) = t+1$.

Since $\ell$ is peeled by $\mathsf{J}^{t+1}$ in $G$, $|\partial\ell(\mathsf{J}^t(G))| = 1$. Let $b = \partial\ell(\mathsf{J}^t(G))$ be the neighboring check
node. In particular, for any $j \in \mathsf{T}^{-1}(i) \setminus \{\ell\}$, all $a \in \partial j(G)$ are peeled in $G$ by round $t$.
Hence, by the inductive hypothesis, all $a \in \partial i_*(\mathsf{T}(G))$ except possibly $b$ are peeled in $\mathsf{T}(G)$
by round $t$. Therefore, $|\partial i_*(\mathsf{J}^{t'}(\mathsf{T}(G)))| \le 1$ for some $t' \le t$ and $i_*$ will be peeled on or before
$\mathsf{J}^{t+1}(\mathsf{T}(G))$, i.e. $t_*(i) \le t+1$. This is a contradiction and the result follows.

Having proved $t_*(i) \le \max_{j \in \mathsf{T}^{-1}(i)} t(j)$, we conclude that $\mathsf{J}^t(\mathsf{T}(G)) \subseteq \mathsf{T}(\mathsf{J}^t(G))$. This, along with
the observation that peeling preserves the partial ordering on graphs as defined before, finishes the
proof of the lemma. □
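
The statement of Lemma 5.1 can be checked on small examples. The following Python sketch is a minimal illustration under two assumptions that are not confirmed by this section: (a) one round of synchronous peeling removes all variable nodes of degree at most 1 together with their adjacent check nodes, and (b) the collapse operator merges variables joined by degree 2 checks, deletes those checks, and gives each supernode the union of its members' remaining neighborhoods. The representation (a dict from check labels to sets of variables) and all names are hypothetical.

```python
def peel(checks, variables, max_rounds=None):
    """Synchronous peeling (ASSUMED rule: each round removes every variable
    of degree <= 1 and the checks adjacent to it).
    Returns (rounds_performed, surviving_variables, surviving_checks)."""
    checks = {a: set(vs) for a, vs in checks.items()}
    alive = set(variables)
    rounds = 0
    while alive and (max_rounds is None or rounds < max_rounds):
        degree = {v: 0 for v in alive}
        for vs in checks.values():
            for v in vs:
                degree[v] += 1
        leaves = {v for v in alive if degree[v] <= 1}
        if not leaves:
            break  # a non-empty core remains
        checks = {a: vs for a, vs in checks.items() if not (vs & leaves)}
        alive -= leaves
        rounds += 1
    return rounds, alive, checks


def collapse(checks, variables):
    """Collapse (ASSUMED rule): merge variables linked by degree-2 checks,
    drop those checks, and re-attach the other checks to supernodes."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for vs in checks.values():
        if len(vs) == 2:
            u, w = vs
            parent[find(u)] = find(w)
    supernodes = {find(v) for v in variables}
    collapsed = {a: {find(v) for v in vs}
                 for a, vs in checks.items() if len(vs) != 2}
    return collapsed, supernodes


def peeling_time(checks, variables):
    """Number of rounds performed before peeling halts."""
    return peel(checks, variables)[0]


# Example: a small tree with two degree-2 checks ("a" and "c").
variables = range(7)
checks = {"a": {0, 1}, "b": {1, 2, 3}, "c": {3, 4}, "d": {4, 5, 6}}
```

On this example both $G$ and its collapse are annihilated in two rounds, and peeling for $\tau = 1$ round before collapsing leaves a graph that needs one further round, so the bound of Lemma 5.1 holds with equality.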

Peelability of a pair $(\alpha, R)$ immediately implies some useful properties.

Lemma 5.2. For a factor degree profile $(\alpha, R)$ that is peelable at rate $\eta > 0$, we have:

(i) $2\alpha R_2 \le 1 - \eta$.

(ii) $\alpha < 1$.

Notice that the factor graph induced by degree 2 check nodes is in natural correspondence with
an ordinary graph (replace every check node by an edge), which is uniformly random given the
number of edges. The average degree of this graph is $2\alpha R_2$, and Lemma 5.2 (i) implies that it is
subcritical, as we would expect for a peelable degree distribution.

Lemma 3.11 is stated for the ensemble $\mathbb{D}(n,R,m)$, $m = n\alpha$. However, in parts of the proof
of this lemma, we find it convenient to work instead with the ensemble $\mathbb{C}(n,R,m)$ introduced in
Section gH 

We need to characterize the residual graph $J_t$ after $t$ rounds of peeling. Lemmas 5.3 and 4.7
achieve this for $G \sim \mathbb{C}(n,R,m)$. Together, they show essentially that density evolution provides
an accurate characterization of $J_t$. Using these lemmas, we are able to deduce (see the proof of Lemma
3.11 (i) and (ii) below) that $J_\tau$ consists of small trees and unicyclic components w.h.p., for large
enough $\tau$. Finally, using Lemma 4.4 we apply the same results to $G \sim \mathbb{D}(n,R,m)$.

Recall that $n_1(G)$ denotes the number of variable nodes of degree 1 in $G$, and $n_{2+}(G)$ denotes
the number of variable nodes of degree 2 or more in $G$. Let

$$\mathbb{C}(n,R,m;n_1',n_2') = \{G : G \in \mathbb{C}(n,R,m),\ n_1(G) = n_1',\ n_{2+}(G) = n_2'\}. \tag{32}$$

In the lemma below, we slightly modify the peeling process, choosing to retain all variable nodes
$V$ in the residual graph (check nodes are eliminated as usual). With a slight abuse of notation,
we keep denoting by $J_t$ the residual graph, although this is obtained from $J_t$ by adding a certain
number of isolated variable nodes.

Lemma 5.3. Consider a graph $G$ drawn uniformly at random from $\mathbb{C}(n,R,m)$. For any $t \in \mathbb{N}$,
consider synchronous peeling for $t$ rounds on $G$, resulting in the residual graph $J_t$. Suppose that for
some $(\tilde{R}, \tilde{m}, \tilde{n}_1, \tilde{n}_2)$, we have $J_t \in \mathbb{C}(n,\tilde{R},\tilde{m};\tilde{n}_1,\tilde{n}_2)$ with positive probability. Then, conditioned on
$J_t \in \mathbb{C}(n,\tilde{R},\tilde{m};\tilde{n}_1,\tilde{n}_2)$, the residual graph $J_t$ is uniformly random within $\mathbb{C}(n,\tilde{R},\tilde{m};\tilde{n}_1,\tilde{n}_2)$.

Our final technical lemma bounds the number of peeling rounds needed to annihilate a tree or
unicyclic component.

Lemma 5.4. Consider a factor graph $G = (F,V,E)$ with no check nodes of degree 1 or 2, that is
a tree or unicyclic. Then $G$ is peelable and $T_{\rm C}(G) \le 2\lceil \log_2 |V| \rceil$.

Proof of Lemma 3.11 (i) and (ii). A standard calculation (see e.g. [DM08], or Section 7.1, which
carries through a similar calculation) shows that, for a uniformly random graph in $\mathbb{C}(n,\tilde{R},\tilde{m};\tilde{n}_1,\tilde{n}_2)$,
with $\tilde{n}_1, \tilde{n}_2 \ge n\varepsilon$ and with $\tilde{m}\tilde{R}'(1) \ge \tilde{n}_1 + 2\tilde{n}_2 + n\varepsilon$ for some $\varepsilon > 0$, the asymptotic degree distribution
of variable nodes is

$$\mathbb{P}\{D = 0\} = q_0, \qquad \mathbb{P}\{D = 1\} = q_1,$$

$$\mathbb{P}\{D = \ell\} = (1 - q_0 - q_1)\,\mathbb{P}\{\mathrm{Poisson}_{\ge 2}(\lambda) = \ell\}, \quad \text{for all } \ell \ge 2,$$

for suitable choices of $q_0$, $q_1$, $\lambda$ depending on the ensemble parameters. Further, by a standard
breadth-first search argument, the neighborhood of a vertex $v$ is dominated stochastically by a
(bipartite) Galton-Watson tree, with offspring distribution equal to the size-biased version of $\tilde{R}$ at
check nodes, and equal to that of $\mathbb{P}\{D = \cdot\,\}$ at variable nodes.

Consider $G \sim \mathbb{C}(n,R,m)$. Using Lemmas 4.7 and 5.3 it is possible to estimate the degree
distribution of $J_t$. A lengthy but straightforward calculation shows that the corresponding branching
factor is $\theta(J_t) = \alpha R''(z_t)$. Now, notice that

$$R''(z) = 2R_2 + \sum_{l=3}^{k} l(l-1)\,R_l\,z^{l-2} \le 2R_2 + k(k-1)\,z$$

for $z \le 1$. Choose $\tau = \tau(\eta,k) < \infty$ such that $z_\tau \le \eta/(3\alpha k(k-1))$. Then we have $\alpha R''(z_\tau) \le
2\alpha R_2 + \eta/3$. But Lemma 5.2 tells us that $2\alpha R_2 \le 1 - \eta$. It follows that $\alpha R''(z_\tau) \le 1 - 2\eta/3$.
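
The arithmetic of this step is easy to verify numerically. The sketch below reads the displayed polynomial as the second derivative $R''(z) = \sum_l l(l-1) R_l z^{l-2}$ of the check degree generating function $R(z) = \sum_l R_l z^l$; this reading is an assumption on our part, and the degree profile, $\alpha$ and $\eta$ used in the example are hypothetical values chosen so that $2\alpha R_2 \le 1 - \eta$.

```python
def second_derivative(R, z):
    """R''(z) for R(z) = sum_l R[l] * z**l, with R a dict {degree: fraction}."""
    return sum(l * (l - 1) * r_l * z ** (l - 2) for l, r_l in R.items())


def z_threshold(alpha, eta, k):
    """Largest z allowed by the choice z_tau <= eta / (3 * alpha * k * (k - 1))."""
    return eta / (3 * alpha * k * (k - 1))


# Hypothetical example: 30% of checks have degree 2, 70% have degree 3.
R = {2: 0.3, 3: 0.7}
alpha = 1.0
eta = 0.4  # chosen so that 2 * alpha * R[2] = 0.6 <= 1 - eta
```

With these values `z_threshold(alpha, eta, 3)` is about 0.022 and `alpha * second_derivative(R, z)` stays below $1 - 2\eta/3 \approx 0.733$ for every $z$ up to that threshold.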

In particular, the branching factor 6 = 9{Jr) associated with the random graph J^ satisfies, 
9 < 1 — r//3, with probability at least 1 — 1/n^. Following a standard argument [BolOlj where 
we explore the neighborhood of v by breadth first search, we obtain that with probability at least 
1 — 1/n^''^ for n > Ni{r],k), the connected component containing f is a tree or unicyclic, with 
size less than C4logn, for some C4 = C4{rj,k) < (X). Applying a union bound we obtain that for 
n > N2 = N2{rj,k), with probability at least 1/n^''^, the event E occurs, where 

$$E = \{\text{All connected components in } J_\tau \text{ are trees or unicyclic and have size at most } C_4 \log n\}. \tag{33}$$

Then, from Lemma 14.41 we infer that E occurs with probability at least 1/n^'^ for G '^ 0{n,R,m) 
provided n > N^, where A^3 = N^ik) < 00. We stick to G ~ D(n, R, m) for the rest of this proof. 

Let us consider first point (i). Clearly, tree components are peelable. If $R_2 = 0$, then there
are no factors of degree 2, and unicyclic components are also peelable (Lemma 5.4). Thus, the
entire graph is annihilated by peeling w.h.p., as claimed. If $R_2 > 0$, then the number of unicyclic
components of size smaller than $M$ is asymptotically Poisson with parameter $C_5 < \infty$ uniformly
bounded in $M$ (this follows e.g. by [Wor81], see also [Wor99, Bol01]). It follows that with probability
at least $\exp(-C_5)/2$ for $n \ge N_4$, there are no unicyclic components of size smaller than $M$. The
expected number of unicyclic components of size $M$ or larger is upper bounded by $\sum_{i \ge M} \theta^i/(2i) \le
\theta^M/(1 - \theta)$, and for $M$ large enough no unicyclic component of this size exists, with probability
at least $1 - \exp(-C_5)/4$. Considering these two contributions, the graph contains no cycle with
probability at least $\exp(-C_5)/4$ for $n$ large enough, and hence it is peelable. This completes part (i).

For (ii), notice that in collapsing a connected component of $J_\tau$, the number of variable nodes
does not increase. Further, a tree component collapses to a tree and a unicyclic component collapses
either to a tree or to a unicyclic component. Thus, we can use Lemma 5.4 with $N \le C_4 \log n$ to
obtain a bound of $C_1 \log\log n$ on the number of peeling rounds needed, with probability at least
1 — 1/nP'^. Since the probability of peelability is uniformly bounded away from zero as n — )• oo, the 
probability that the same bound on the number of peeling rounds holds conditioned on peelability 
is at least (for some 6 > 0) 1 — l/{dnP'^) > 1 — l/nP'^ for n > N^, as required. D 

5.2 Proof of Lemma 3.11 (iii)

The following lemma bounds the size of a supercritical Galton-Watson tree, observed up to finite
depth. The proof is in Appendix B.

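Lemma 5.5. Consider a Galton-Watson branching process $\{Z_t\}_{t=0}^{\infty}$ with $Z_0 = 1$ and with offspring
distribution $\mathbb{P}\{Z_1 = j\} = b_j$, $j \ge 0$. Suppose $b_r \le (1-\delta)^r/\delta$ for all $r \ge 0$, for some $\delta > 0$.
Also, assume that the branching factor satisfies $\theta = \sum_{j=1}^{\infty} j\,b_j = \mathbb{E}[Z_1] > 1$. Then, there exists
$C = C(\delta) > 0$ such that the following happens.
For any $\beta \ge 3$ and $T \in \mathbb{N}$, we have

$$\mathbb{P}\Big\{ \sum_{t=0}^{T} Z_t > (\beta\theta)^T \Big\} \le 2\exp\{-C(\beta/3)^T\}. \tag{34}$$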



Proof of Lemma 3.11 (iii). From Lemma 5.2 (ii), we know that $\alpha < 1$. The following occurs in
the collapse process: Let $G^{(2)} = (F^{(2)}, V, E^{(2)})$ be the subgraph of $G$ induced by the degree 2
factor nodes (with isolated vertices retained). We have $F_* = F \setminus F^{(2)}$. All variable nodes that
belong to a single connected component of $G^{(2)}$ coalesce into a single super-node $v' \in V_*$ in $G_*$,
with a neighborhood that consists of the union of the individual neighborhoods restricted to $F_*$
(cf. Definition 3.3). As mentioned above, $G^{(2)}$ is a random factor graph with $\alpha R_2 n$ factor nodes
of degree 2, and is in one-to-one correspondence with a uniformly random graph. For $v' \in V_*$, we
denote by $S(v')$ the number of variable nodes in $V$ in the component $v'$. Lemma 5.2 (i) implies
that the branching factor of $G^{(2)}$ obeys $2\alpha R_2 \le 1 - \eta$, i.e., $G^{(2)}$ is subcritical. This leads to
the following claim, which follows immediately from a well known result on the size of the largest
connected component in a subcritical random graph [Bol01].

Claim 1: There exists $C_2 = C_2(\eta) < \infty$, $N_2 = N_2(\eta) < \infty$ such that the following occurs
for all $n \ge N_2$. No component $v' \in V_*$ is composed of more than $C_2 \log n$ variable nodes, i.e.
$\max_{v' \in V_*} S(v') \le C_2 \log n$, with probability at least $1 - 1/n$.
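
The logarithmic component size in Claim 1 can be observed in simulation. The sketch below is only illustrative: it samples a random multigraph by drawing both endpoints of every edge uniformly at random (self-loops and repeated edges are allowed, which is our simplification rather than the ensemble used in the text) and computes the component sizes with a union-find structure. The function name and parameters are hypothetical.

```python
import random
from collections import Counter


def component_sizes(n, num_edges, seed=0):
    """Sizes of the connected components of a random multigraph with n
    vertices and num_edges edges (endpoints drawn uniformly at random)."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for _ in range(num_edges):
        u, w = rng.randrange(n), rng.randrange(n)
        parent[find(u)] = find(w)
    # Map each component representative to the size of its component.
    return Counter(find(v) for v in range(n))
```

For example, `component_sizes(20000, 5000)` corresponds to average degree $2\alpha R_2 = 0.5$; the largest component then typically contains a few dozen vertices, in line with a bound of the form $C_2 \log n$.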

Let $G^{\sim 2} = (F_*, V, E \setminus E^{(2)})$, i.e., $G^{\sim 2}$ is the subgraph of $G$ induced by factors of degree greater
than 2 (with isolated vertices retained).

From Poisson estimates on the node degree distribution, we get the following.
Claim 2: There exists $C_3 = C_3(\eta,k) < \infty$, $N_3 = N_3(\eta,k) < \infty$ such that the following occurs. For
all $n \ge N_3$, no variable node $v \in V$ has degree larger than $C_3 \log n$ in $G^{\sim 2}$, i.e., $\deg_{G^{\sim 2}}(v) \le C_3 \log n$
for all $v \in V$, with probability at least $1 - 1/n$.

Note that we used $\alpha < 1$ (from Lemma 5.2 (ii)) to avoid dependence on $\alpha$ in the above claim.

Let

$$E = \{S(v') \le C_2 \log n \text{ for all } v' \in V_*\} \cap \{\deg_{G^{\sim 2}}(v) \le C_3 \log n \text{ for all } v \in V\}.$$

Claims 1 and 2 above imply that $E$ holds with probability at least $1 - 2/n$ for $n \ge N_4$, for some
$N_4 = N_4(\eta,k) < \infty$.

Clearly, $G^{\sim 2}$ is independent of $G^{(2)}$. In particular, for $v \in V$ that is part of supernode $v' \in V_*$,
we know that $S(v')$ is independent of $G^{\sim 2}$. There is a slight dependence between the degrees of
different variable nodes, but assuming $E$, the effect of this is small if we only condition on polylog($n$)
nodes in $G_*$. This enables our bound on the size of balls in $G_*$.

Recall that the distribution of a random variable $X_1$ is dominated by the distribution of $X_2$ if
there exists a coupling between $X_1$ and $X_2$ such that $X_1 \le X_2$ with probability 1. In bounding
the size of a ball of radius $T_{\rm ub}$, we are justified in replacing degree distributions by dominating
distributions, and in assuming that there are no loops.

Fixing a vertex $v' \in V_*$, we construct the ball $B_{G_*}(v', T_{\rm ub})$ sequentially through a breadth-first
search. Choose $\varepsilon = \eta/2$. For $n$ large enough, the distribution of $S(v')$ is dominated by the
distribution of the number of nodes in a Galton-Watson tree with offspring distribution Poisson($2\alpha R_2 + \varepsilon$).
The distribution of $\deg_{G^{\sim 2}}(v)$ is dominated by Poisson($\alpha \sum_{l \ge 3} l R_l + \varepsilon$). In particular, the degree
distribution of $G_*$ is dominated by a geometric distribution $b_r \le (1-\delta)^r/\delta$ for some $\delta = \delta(\eta,k) > 0$.
Assuming $E$, this also holds conditionally on the nodes revealed so far, as long as the number of
these is, say, polylog($n$).

Thus, assuming $\mathsf{E}$, the number of nodes in a ball of radius $T_{\rm ub} = C_1 \log\log n$ is dominated by the number of nodes in a Galton-Watson tree of depth $T_{\rm ub}$ with offspring distribution $(b_r)_{r \ge 0}$ satisfying $b_r \le (1-\delta)^r/\delta$ for some $\delta > 0$, and $\theta = \sum_{j \ge 1} j\, b_j < C_5$ for $n \ge N_5$. We deduce from Lemma 5.3 that

$$\mathbb{P}\Big\{ \max_{v' \in V_*} |B_{G_*}(v', T_{\rm ub})| \le (\log n)^{C_6} \Big\} \ge 1 - 1/n \tag{35}$$

for some $C_6 = C_6(\eta, k) < \infty$, where $|B_{G_*}(v', T_{\rm ub})|$ denotes the number of super-nodes in $B_{G_*}(v', T_{\rm ub})$. But given $\mathsf{E}$, the size of components $v' \in V_*$ is uniformly bounded by $C_2 \log n$. Thus, conditioned on $\mathsf{E}$, we have $\max_{v' \in V_*} S(v', T_{\rm ub}) \le C_2 (\log n)^{C_6 + 1}$ with probability at least $1 - 1/n$. At this point, we recall that $\mathbb{P}[\mathsf{E}] \ge 1 - 2/n$, and the result follows. □
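The domination argument above compares a ball in the collapsed graph with the first few generations of a Galton-Watson tree. The following is a minimal simulation sketch of that comparison object, not the authors' construction: it assumes a plain Poisson offspring law (the proof uses a dominating law with geometric tail), and the function names `poisson` and `gw_ball_size` are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Sample Poisson(mean) with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def gw_ball_size(rng, mean_offspring, depth):
    """Number of nodes within `depth` generations of the root of a Galton-Watson tree."""
    total, generation = 1, 1
    for _ in range(depth):
        # Each node of the current generation draws an independent number of children.
        generation = sum(poisson(rng, mean_offspring) for _ in range(generation))
        total += generation
        if generation == 0:
            break
    return total


rng = random.Random(0)
sizes = [gw_ball_size(rng, 1.5, 4) for _ in range(2000)]
average_size = sum(sizes) / len(sizes)
# Expected size of the depth-4 ball is 1 + 1.5 + ... + 1.5**4.
expected_size = sum(1.5 ** t for t in range(5))
```

With depth of order $\log\log n$ and bounded mean offspring, the expected ball size is polylogarithmic in $n$, which is the scale of the bound in Eq. (35).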



6 Proof of Lemmas 3.9 and 3.12: Characterizing the periphery

In this section we characterize the periphery of a factor graph $G$ when it has a non-trivial 2-core. Recall the definitions of the 2-core, backbone and periphery of a graph from Section 3.2. First, we note some of the properties of these subgraphs that will be useful in the proof of the main lemmas of this section.

As a matter of notation, for a bipartite graph $G$ chosen uniformly at random from the set $\mathbb{G}(n, k, m)$ we denote by $G_P$ the periphery of $G$ and by $G_p$ (lower case subscript) a subgraph of $G$ that is a potential candidate for being the periphery of $G$. Similarly, we denote by $G_B$ the backbone of $G$ and by $G_b$ a subgraph of $G$ that is a potential candidate for being the backbone of $G$.



6.1 Proof of Lemma 3.9: Periphery is Conditionally a Uniform Random Graph

Lemma 3.9 states that if we fix the number of nodes and the check degree profile of the periphery of a graph $G$ chosen uniformly at random from the set $\mathbb{G}(n, k, m)$, then the periphery, $G_P$, is distributed uniformly at random conditioned on being peelable. Since the original graph $G$ is chosen uniformly at random, in order to prove this lemma it is enough to count, for each possible choice of the periphery $G_p$, the number of graphs $G$ that have the periphery $G_p$.

Before proving Lemma 3.9 we first introduce the concept of a 'rigid' graph and establish a monotonicity property for the backbone augmentation procedure, which was defined in Section 3.2. We use the notation $G \subseteq G'$ if $G$ is a subgraph of $G'$.

Lemma 6.1. Let $G = (F, V, E)$ be a bipartite graph and let $G_s$ be the subgraph of $G$ induced by some $F_s \subseteq F$. Let $F_l$ and $F_u$ be subsets of $F$ such that $F_l \subseteq F_u$ and $F_l \subseteq F_s$. Let $B_l^{(0)}$ be the subgraph induced by $F_l$ (so $B_l^{(0)} \subseteq G_s$) and let $B_u^{(0)}$ be the subgraph induced by $F_u$. Denote by $B_l^{(\infty)}$ the output of the backbone augmentation process on $G_s$ with the initial graph $B_l^{(0)}$ and by $B_u^{(\infty)}$ the output of the backbone augmentation process on $G$ with the initial graph $B_u^{(0)}$. Then $B_l^{(\infty)} \subseteq B_u^{(\infty)}$.

The proof of Lemma 6.1 can be found in the appendix.

Definition 6.2. Define a graph to be rigid if its backbone is the whole graph. We denote by $\mathcal{R}(n, k, m)$ the class of rigid graphs with $n$ variable nodes and $m$ check nodes, each of degree $k$.

Lemma 6.3. Consider a bipartite graph $G = (F, V, E)$ from the ensemble $\mathbb{G}(n, k, m)$. For some set of check nodes $F_b \subseteq F$ denote by $G_b = (F_b, V_b, E_b)$ the subgraph induced by $F_b$, and denote by $G_p = (F_p, V_p, E_p)$ the subgraph of $G$ induced by the pair $(F_p = F \setminus F_b, V_p = V \setminus V_b)$. Assume $G_b$ and $G_p$ satisfy the following conditions:

• $G_p$ is peelable,

• $G_b$ is rigid,

• $|\partial a| \ge 2$ for all $a \in F_p$, where the neighborhood is taken in $G_p$.

Then $G_b$ is the backbone of $G$ (and $G_p$ is the periphery).

Proof. If $G_b$ is empty the lemma is trivially true. Assume $G_b$ is nonempty. We prove this lemma in two steps. In the first step we prove that $G_b$ is a subgraph of $G_B$, the backbone of $G$. In the second step we show that $G_B$ cannot contain anything outside $G_b$.

Since $G_b$ is rigid, it contains a non-empty 2-core $(G_b)_C$ and the output of the backbone augmentation procedure with initial graph $(G_b)_C$ is $G_b$ itself. Furthermore, $(G_b)_C$ is part of $G_C$, the 2-core of the original graph $G$, since by definition a 2-core is the maximal stopping set (cf. Definition 2.2) and $(G_b)_C$ is a stopping set in $G$. Hence, the monotonicity of the backbone augmentation procedure (Lemma 6.1) implies that $G_b \subseteq G_B$.

In the second step, we prove that $G_B$ cannot contain any node outside $G_b$. First note that $G_p$ cannot contain any check node from the 2-core of the original graph $G$. We prove this by contradiction. Suppose instead that $\tilde F$ is the nonempty set of all the check nodes from the 2-core of $G$ that are in $G_p$. Let $\tilde V$ be the set of neighbors of $\tilde F$ in $G_p$. The nodes in $\tilde V$ are also part of the 2-core of $G$ and have degree at least 2 in the 2-core of $G$. Furthermore, there is no edge incident from $F_b$ to $V_p$ because, by definition, $G_b$ is check-induced. In particular, in the 2-core of $G$, there is no other edge incident on variables in $\tilde V$ beyond the ones coming from $\tilde F$. Hence, in the non-empty subgraph $\tilde G \subseteq G_p$ induced by the check nodes in $\tilde F$ and all their neighbors, every variable node has degree at least 2. This subgraph is then, by definition, a stopping set in $G_p$. But by assumption $G_p$ is peelable and cannot contain a stopping set. This is a contradiction that rules out the existence of a nonempty set $\tilde F$. Hence, the 2-core of $G$ is contained entirely in $G_b$ (recall that both $G_b$ and the 2-core are check-induced).

Let $B^{(1)}$ and $B^{(2)}$ be the outputs of the backbone augmentation procedure on $G$, once with initial subgraph given by the 2-core of $G$ and once with the initial subgraph given by $G_b$ (which contains the 2-core of $G$). By monotonicity, $B^{(1)} \subseteq B^{(2)}$. But the process with the initial subgraph $G_b$ terminates immediately since, by assumption, all check nodes outside $G_b$ have at least two neighbors in $G_p$. Therefore, $G_B = B^{(1)} \subseteq B^{(2)} = G_b$. This finishes our proof. □
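The objects in Lemma 6.3 can be made concrete on a toy factor graph. The sketch below is illustrative only and rests on assumptions: checks are tuples of variable indices, the 2-core is computed by peeling checks that touch a degree-1 variable, and the backbone augmentation rule (add a check once at most one of its variables lies outside the current subgraph) is paraphrased from the proof above, since Section 3.2 is not reproduced here. The helper names `two_core`, `backbone` and `periphery` are hypothetical.

```python
def two_core(checks):
    """Indices of the checks in the maximal stopping set (the 2-core)."""
    alive = set(range(len(checks)))
    while True:
        degree = {}
        for a in alive:
            for v in checks[a]:
                degree[v] = degree.get(v, 0) + 1
        # A check touching a variable of degree 1 can be peeled.
        peelable_now = {a for a in alive if any(degree[v] == 1 for v in checks[a])}
        if not peelable_now:
            return alive
        alive -= peelable_now


def backbone(checks):
    """Grow the 2-core by adding checks with at most one variable outside."""
    check_set = two_core(checks)
    var_set = {v for a in check_set for v in checks[a]}
    changed = True
    while changed:
        changed = False
        for a, check in enumerate(checks):
            if a not in check_set and len(set(check) - var_set) <= 1:
                check_set.add(a)
                var_set |= set(check)
                changed = True
    return check_set, var_set


def periphery(checks):
    """Periphery checks restricted to the variables outside the backbone."""
    check_set, var_set = backbone(checks)
    return [tuple(v for v in check if v not in var_set)
            for a, check in enumerate(checks) if a not in check_set]


checks = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3),  # rigid part
          (0, 1, 4), (4, 5, 6), (5, 6, 7)]
backbone_checks, backbone_vars = backbone(checks)
periphery_checks = periphery(checks)
```

On this example the periphery has an empty 2-core (it is peelable) and each of its checks keeps at least two periphery variables, matching the three conditions of the lemma.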

It is easy to see that the converse of Lemma 6.3 is also true, as stated below.

Remark 6.4. If $G_b = G_B$ is the backbone of $G$, then the subgraphs $G_b$ and $G_p = G \setminus G_b = G_P$ satisfy the conditions of Lemma 6.3. Here $G \setminus G_b$ denotes the subgraph of $G$ induced by $(F \setminus F_b, V \setminus V_b)$.

We omit a formal proof of Remark 6.4. However, notice that the fact that the graph $G \setminus G_B$ is peelable follows from the connection between the peeling algorithm and $\mathrm{BP}_0$ stated in Lemmas 4.2 and 4.3. We stated that the messages coming out of the backbone are always 0. From the check node update rule, an incoming message to a check node can be dropped without changing any of the outgoing messages as long as there is at least one other incoming message. By definition there is no edge between variable nodes in the periphery and check nodes in the backbone. Furthermore, all the check nodes in the periphery have at least two neighbors in the periphery. Therefore, $\mathrm{BP}_0$ on the periphery has the same messages as the corresponding messages of $\mathrm{BP}_0$ on the whole graph. In particular, the fixed point of $\mathrm{BP}_0$ on the periphery is all $*$ messages, which shows that the periphery subgraph is peelable.

We now prove Lemma 3.9.

Proof of Lemma 3.9. Our goal is to characterize the probability of observing the periphery of $G$ to be $G_p = (F_p, V_p, E_p)$. We use the shorthand notation $G \setminus G_p$ to denote the subgraph of $G$ induced by the check-variable node pair $(F \setminus F_p, V \setminus V_p)$. Let $G_b = (F \setminus F_p, V \setminus V_p, E_b) = G \setminus G_p$ and let $E_{pb} = \{(i, a) \,|\, i \in V \setminus V_p,\ a \in F_p\}$ be a set of edges that satisfy the condition $\deg_{E_p}(a) + \deg_{E_{pb}}(a) = k$ for all $a \in F_p$. As before, we denote by $G_P$ and $G_B$ the actual periphery and backbone of the graph $G$. Define the set of rigid graphs on $n_b$ variable nodes, $m_b$ check nodes and check degree $k$, $\mathcal{R}(n_b, k, m_b)$, as

$$\mathcal{R}(n_b, k, m_b) = \{ G_b = (F_b, V_b, E_b) : |F_b| = m_b,\ |V_b| = n_b,\ |\partial a| = k \ \forall a \in F_b,\ G_b \text{ is rigid} \} . \tag{36}$$

By Lemma 6.3,

$$\{ G \in \mathbb{G}(n, k, m) : G_P = G_p,\ G_B = G_b \} = \{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G \setminus G_p = G_b,\ G_p \in \mathcal{P},\ G_b \in \mathcal{R} \} , \tag{37}$$

and in particular,

$$\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \} = \{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G_p \in \mathcal{P},\ G \setminus G_p \in \mathcal{R} \} . \tag{38}$$

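From Eq. (38), and counting all the choices for the subgraph $G_b = G \setminus G_p$ and the edges that connect $G_p$ and $G_b$,

$$|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \sum_{G_b} \sum_{E_{pb}} |\{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G_p \in \mathcal{P},\ G \setminus G_p = G_b,\ G_b \in \mathcal{R},\ E \setminus (E_p \cup E_b) = E_{pb} \}| . \tag{39}$$

For fixed $G_p$ and $G_b$, we can count the number of ways these two subgraphs can be connected to each other. Letting $R$ be the degree profile of $G_p$, we have

$$|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \sum_{G \setminus G_p \in \mathcal{R}} \mathbb{I}(G_p \in \mathcal{P}) \prod_{l=2}^{k} \binom{n - |V_p|}{k - l}^{|F_p| R_l} . \tag{40}$$

We can rewrite this as

$$|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \prod_{l=2}^{k} \binom{n - |V_p|}{k - l}^{|F_p| R_l} \, |\mathcal{R}(n - |V_p|, k, m - |F_p|)| \, \mathbb{I}(G_p \in \mathcal{P}) . \tag{41}$$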

It is clear that the cardinality of the set $\mathcal{R}(n_b, k, m_b)$ is a function of only $n_b$ and $m_b$. Hence,

$$|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = Z(n_p, k, R^p, m_p)\, \mathbb{I}(G_p \in \mathcal{P}) , \tag{42}$$

for some function $Z(\cdot, \cdot, \cdot, \cdot)$. Since the graph $G$ itself was chosen uniformly at random from the set $\mathbb{G}(n, k, m)$, this shows that conditioned on $(n_p, R^p, m_p)$, all graphs $G_p \in \mathcal{P}$ with $n_p$ variable nodes, $m_p$ check nodes, and check degree profile $R^p$ are equally likely to be observed. □
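The counting step in this proof depends on the candidate periphery only through its size and check degree profile. A minimal sketch of the attachment count is given below, under the assumption that a periphery check of degree $l$ chooses its remaining $k - l$ neighbors as a set among the backbone variables; the function name `attachment_count` and the dictionary encoding of the degree profile are hypothetical.

```python
from math import comb


def attachment_count(num_backbone_vars, k, periphery_degree_counts):
    """Ways to connect periphery checks to backbone variables.

    periphery_degree_counts maps a periphery degree l (2 <= l <= k) to the
    number of periphery checks with that degree; each such check still needs
    k - l neighbors among the backbone variables.
    """
    total = 1
    for degree, count in periphery_degree_counts.items():
        total *= comb(num_backbone_vars, k - degree) ** count
    return total


# Two checks of periphery degree 2 and one of degree 3, with k = 3 and 4 backbone variables.
example = attachment_count(4, 3, {2: 2, 3: 1})
```

Two candidate peripheries with the same number of variables, number of checks and degree counts give the same value, which is why the conditional law is uniform.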

6.2 Proof of Lemma 3.12: Periphery is Exponentially Peelable

Let $G = (F, V, E)$ be a graph drawn uniformly at random from $\mathbb{G}(n, k, \alpha n)$, and let $G_P = (F_P, V_P, E_P)$ be its periphery. Recall the connection between $\mathrm{BP}_0$ and the peeling algorithm from Section 4. Let $Q$ be defined as in Theorem 1, i.e., $Q$ is the largest positive solution of $Q = 1 - \exp\{-k\alpha Q^{k-1}\}$. In light of Lemma 4.9, we define the asymptotic degree profile pair of the periphery, $(\bar\alpha, R(x))$, as follows (recall that, from Lemma 4.3, the periphery does include check nodes receiving at most $k - 2$ messages of type 0).

Definition 6.5.

$$R(x) = \frac{1}{1 - Q^k - k(1-Q)Q^{k-1}} \sum_{l=2}^{k} \binom{k}{l} Q^{k-l} (1-Q)^l x^l , \tag{43}$$

$$\bar\alpha = \alpha\, \frac{1 - Q^k - k(1-Q)Q^{k-1}}{1 - Q} . \tag{44}$$

Unlike the backbone, where all check nodes are of degree $k$, the periphery can have check nodes of degrees between 2 and $k$. Among these, check nodes of degree 2 are of importance to us since they can potentially form long strings. Strings are particularly unfriendly structures for the peeling algorithm; peeling takes linear time to peel such structures. In the next lemma, we define a parameter as a function of $Q$, which is the estimated branching factor of the subgraph of the periphery induced by check nodes of degree 2. Lemma 6.6 proves that this branching factor is less than one for all $\alpha \in (\alpha_d(k), 1]$.

Lemma 6.6. Let $\theta = \alpha k(k-1)(1-Q)Q^{k-2}$ with $Q$ as defined in Theorem 1. Then $\theta < 1$ for all $\alpha \in (\alpha_d(k), 1]$.

The proof of this lemma can be found in Appendix C.

Lemma 6.7. Let $Q$ be defined as in Theorem 1. Then the pair $(\bar\alpha, R)$ defined in Definition 6.5 is peelable at rate $\eta_1$ for some $\eta_1 = \eta_1(\alpha, k) > 0$. In particular, $0 \le f(z, \bar\alpha, R) \le (1 - \eta_1) z$ for all $z \in (0, 1]$.

Proof. In view of the density evolution recursion (Definition [9]) , define 

$$f(z) = 1 - \exp(-\bar\alpha R'(z)) .$$

We prove the lemma by showing that $f'(0) = \theta < 1$ and that $f(z) < z$ strictly for $z \in (0, 1]$. Using the definitions of $\bar\alpha$ and $R(z)$, the function $f(z)$ can be written as

$$f(z) = 1 - \exp\big(-\alpha k \big( (Q + (1-Q)z)^{k-1} - Q^{k-1} \big)\big) . \tag{45}$$

By a straightforward calculation, and using Lemma 6.6, we get

$$f'(0) = \bar\alpha R''(0) \exp(-\bar\alpha R'(0)) = \alpha k(k-1)(1-Q)Q^{k-2} = \theta < 1 . \tag{46}$$

Assume $0 < y \le 1$ to be a fixed point of $f$, i.e.,

$$y = 1 - \exp\big(-\alpha k \big( (Q + (1-Q)y)^{k-1} - Q^{k-1} \big)\big) . \tag{47}$$

Using the identity $Q = 1 - \exp(-\alpha k Q^{k-1})$ and after some calculation, we get

$$Q + (1-Q)y = 1 - \exp\big(-\alpha k (Q + (1-Q)y)^{k-1}\big) . \tag{48}$$

Equation (48) shows that $Q + (1-Q)y$ is a fixed point of the original density evolution recursion (9) with $R(x) = x^k$. Since, by definition, $Q$ is the largest fixed point of that recursion, $y = 0$ is the only fixed point of $f(z) = 1 - \exp(-\bar\alpha R'(z))$ in the interval $[0, 1]$. Since $f'(0) < 1$ and $f$ is continuous, we have $f(z) < z$ for all $z \in (0, 1]$, and therefore $f(z)/z < 1$ for all $z \in [0, 1]$ (with the value at $z = 0$ understood as $f'(0)$). The claim follows by taking $\eta_1 = 1 - \sup_{z \in [0,1]} f(z)/z$, with $\eta_1 > 0$ by continuity of $z \mapsto f(z)/z$ over the compact $[0, 1]$. □
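The claims of Lemma 6.7 can be checked numerically for a specific pair $(\alpha, k)$. The sketch below is illustrative, not part of the proof. It assumes $k = 3$ and $\alpha = 0.85$ (a value above the $k = 3$ clustering threshold, which is roughly 0.818), computes $Q$ by iterating the fixed point map from 1, and uses the degree profile of Definition 6.5 as read here: $R_l \propto \binom{k}{l} Q^{k-l}(1-Q)^l$ for $l \ge 2$. All function names are hypothetical.

```python
import math


def largest_fixed_point(alpha, k, iterations=5000):
    """Largest solution of Q = 1 - exp(-k*alpha*Q**(k-1)), by iteration from Q = 1."""
    q = 1.0
    for _ in range(iterations):
        q = -math.expm1(-k * alpha * q ** (k - 1))
    return q


def periphery_profile(alpha, k):
    """Return Q, the periphery density alpha_bar and the coefficients R_l, l = 2..k."""
    q = largest_fixed_point(alpha, k)
    norm = 1.0 - q ** k - k * (1.0 - q) * q ** (k - 1)
    coeffs = {l: math.comb(k, l) * q ** (k - l) * (1.0 - q) ** l / norm
              for l in range(2, k + 1)}
    alpha_bar = alpha * norm / (1.0 - q)
    return q, alpha_bar, coeffs


def f_from_profile(z, alpha_bar, coeffs):
    """f(z) = 1 - exp(-alpha_bar * R'(z)) computed from the degree profile."""
    r_prime = sum(l * c * z ** (l - 1) for l, c in coeffs.items())
    return -math.expm1(-alpha_bar * r_prime)


def f_closed_form(z, alpha, k, q):
    """Closed form of f(z) as in Eq. (45)."""
    return -math.expm1(-alpha * k * ((q + (1.0 - q) * z) ** (k - 1) - q ** (k - 1)))


alpha, k = 0.85, 3
q, alpha_bar, coeffs = periphery_profile(alpha, k)
theta = alpha * k * (k - 1) * (1.0 - q) * q ** (k - 2)
grid = [i / 1000 for i in range(1, 1001)]
max_gap = max(abs(f_from_profile(z, alpha_bar, coeffs) - f_closed_form(z, alpha, k, q))
              for z in grid)
max_ratio = max(f_closed_form(z, alpha, k, q) / z for z in grid)
```

Here `max_gap` compares the two expressions for $f$, `theta` is the slope at the origin, and `1 - max_ratio` is a grid estimate of the peeling rate $\eta_1$.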

We can now prove Lemma 3.12.

Proof of Lemma 3.12. For any $\varepsilon > 0$, by Lemmas 4.3 and 4.9 we know that

$$|\alpha_p - \bar\alpha| \le \varepsilon , \qquad |R^p_l - R_l| \le \varepsilon \ \text{ for } l \in \{2, \dots, k\} , \tag{49}$$

hold w.h.p.

Using $R_0 = R_1 = 0$, we obtain that the function $f(z, \alpha, R)/z$ is an analytic function over the set $[0, 1]^{k+1}$. By Lemma 6.7, $f(z, \bar\alpha, R)/z \le 1 - \eta_1$. It follows that, for $\varepsilon$ small enough, $f(z, \alpha_p, R^p)/z \le 1 - (\eta_1/2)$, and hence the periphery is w.h.p. peelable at rate $\eta = \eta_1/2$. This proves part (i). Part (ii) follows immediately from Lemma 4.9. □

7 Proof of Lemma 3.5

We find it convenient to work within the configuration model: we assume here that $G$ is drawn uniformly at random from $\mathbb{C}(n, k, m)$. The following fact is an immediate consequence of Lemma

Fact 7.1. Assume $G$ is drawn uniformly at random from $\mathbb{C}(n, k, m)$, and denote by $\mathbf{n}_c, \mathbf{m}_c$ the number of variable and check nodes in the core of $G$. Suppose $(\mathbf{n}_c = n_c, \mathbf{m}_c = m_c)$ occurs with positive probability. Then conditioned on $(\mathbf{n}_c = n_c, \mathbf{m}_c = m_c)$, the core is drawn uniformly from $\mathbb{C}(n_c, k, m_c; 0, n_c)$ (recall the definition of this ensemble in Eq. (32)).

In words, the core is drawn uniformly from $\mathbb{C}(n_c, k, m_c)$ conditioned on all variable nodes having degree 2 or more.

Now, it has been proved [DM08] that, w.h.p.,

$$|n_c/n - (1 - \exp(-\alpha k \hat Q)(1 + \alpha k \hat Q))| = o(1) , \tag{50}$$

$$|m_c/n - \alpha Q^k| = o(1) , \tag{51}$$

where $(Q, \hat Q)$ is as defined in Theorem 1. The above bounds also follow from Lemmas 4.3 and 4.6.
The kernel of the core system $\mathcal{S}_C$ contains all vectors $x$ with the following property. Let $V^{(1)} \subseteq V_C$ be the subset of variables taking value 1 in $x$ (i.e. the support of $x$). Then the subgraph of $G_C$ induced by $V^{(1)}$ has no check node with odd degree.

We will refer to such subgraphs as even subgraphs. Explicitly, even subgraphs are variable-induced subgraphs such that no check node has odd degree. We want to characterize the even subgraphs of $G_C$ having no more than $n\varepsilon$ variable nodes, in terms of their size and number. Lemma 7.4 in subsection 7.1 below allows us to do this provided certain conditions are met. Our next lemma tells us that the core meets these conditions w.h.p.
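The correspondence between kernel vectors and even subgraphs is a parity condition on each check. A minimal sketch follows, assuming checks are given as tuples of variable indices; the names `check_parities` and `is_even_subgraph` are hypothetical.

```python
def check_parities(checks, support):
    """Parity (0 or 1) of the number of support variables in each check."""
    chosen = set(support)
    return [len(chosen & set(check)) % 2 for check in checks]


def is_even_subgraph(checks, support):
    """True if the subgraph induced by `support` has no check of odd degree."""
    return not any(check_parities(checks, support))


checks = [(0, 1, 2), (0, 1, 3), (2, 3, 4)]
even_example = is_even_subgraph(checks, {0, 1})   # every check sees 0 or 2 ones
odd_example = is_even_subgraph(checks, {2, 3})    # check (0, 1, 2) sees one 1
```

A vector $x$ with support `support` solves the homogeneous system exactly when `is_even_subgraph` returns `True`.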

Lemma 7.2. Fix $k$ and consider any $\alpha \in (\alpha_d(k), \alpha_s(k))$. There exists $\delta = \delta(\alpha, k) > 0$ such that the following happens. Let $G$ be drawn uniformly from $\mathbb{C}(n, k, \alpha n)$. Let $n_c$ be the (random) number of variable nodes in the core, $m_c$ be the number of check nodes in the core and $\alpha_c = m_c/n_c$. Let $\eta_c$ be the unique positive solution of

$$\frac{\eta_c (e^{\eta_c} - 1)}{e^{\eta_c} - 1 - \eta_c} = \alpha_c k , \tag{52}$$

and let $\theta_{2c} = \eta_c (k-1)/(e^{\eta_c} - 1)$. For any $\delta' > 0$, we have, w.h.p.:
(i) $\theta_{2c} \le 1 - \delta$.
(ii) $\alpha_c \in [2/k + \delta, 1]$.
(iii) $n_c/n \ge (1 - \exp(-\alpha k \hat Q)(1 + \alpha k \hat Q)) - \delta'$.

The discussion in subsection 7.1 throws light on the definitions of $\eta_c$ and $\theta_{2c}$ used.

Proof of Lemma 7.2. From Eqs. (50), (51), we deduce that $\eta_c = \alpha k \hat Q + o(1)$ w.h.p., leading to

$$\theta_{2c} = \alpha k(k-1) Q^{k-2} (1 - Q) + o(1) \le 1 - \delta$$

for sufficiently small $\delta$, using Lemma 6.6. Thus, we have established point (i).

Point (iii) and the lower bound in point (ii) are easy consequences of Eqs. (50), (51). The upper bound in point (ii), $\alpha_c \le 1$ w.h.p., follows directly from the fact that for $\alpha < \alpha_s$, the system $Mx = b$ has a solution for all $b \in \{0, 1\}^m$ w.h.p. □

Proof of Lemma 3.5. Consider first $G \sim \mathbb{C}(n, k, m)$. Applying Fact 7.1 and Lemma 7.2 we deduce that, conditional on the number of nodes, the core is $G_C \sim \mathbb{C}(n_c, k, m_c; 0, n_c)$ and satisfies the conditions of Lemma 7.4 proved below. By Lemma 7.3, the elements of $\mathcal{L}_C(\varepsilon n)$ are in correspondence with simple loops in the subgraph of $G_C$ induced by degree-2 variable nodes. The sparsity bounds follow from Lemma 7.4. The claim that they are, with high probability, disjoint follows instead from the fact that this random subgraph is subcritical (since $2\alpha R_2 < 1$) and hence decomposes into trees and unicyclic components.

Using Lemma HiU we deduce that the result holds also for the G ~ G(n, k, m) as required. D 

7.1 Characterizing even subgraphs of the core

This section aims at characterizing the small even subgraphs of the core $G_C$. For the sake of simplicity, we shall drop the subscript $C$ throughout the subsection.

Fix $k$. Consider some $\alpha > 2/k$. Let $\eta_* > 0$ be defined implicitly by

$$\frac{\eta_* (e^{\eta_*} - 1)}{e^{\eta_*} - 1 - \eta_*} = \alpha k . \tag{53}$$

For $\alpha \in (2/k, \infty)$, we have $\eta_*(\alpha) > 0$ and $\eta_*$ is an increasing function of $\alpha$ at fixed $k$ [DM08].

Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, \alpha n; 0, n)$. The rationale for this definition of $\eta_*$ is that the asymptotic degree distribution of variable nodes in $G$ is Poisson$(\eta_*)$ conditioned on the outcome being greater than or equal to 2 (to be denoted below Poisson$_{\ge 2}(\eta_*)$).

We are interested in even subgraphs of $G$.

Consider the subgraph $G_2 = (F, V^{(2)}, E^{(2)})$ of $G$ induced by variable nodes of degree 2 (with all factor nodes retained). The asymptotic branching factor of this subgraph turns out to be $\theta_2 = \eta_* (k-1)/(e^{\eta_*} - 1)$. We impose the condition $\theta_2 \le 1 - \delta$ for some $\delta > 0$ (since this is true of the core). Note that $\theta_2$ is a decreasing function of $\eta_*$, and hence a decreasing function of $\alpha$, for fixed $k$.
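The implicit definition of $\eta_*$ and the resulting branching factor $\theta_2$ are easy to evaluate numerically. The sketch below is an illustration under stated assumptions: it solves Eq. (53) by bisection, using the fact that the left-hand side is the mean of a Poisson variable conditioned to be at least 2 (which is increasing in $\eta$ and tends to 2 as $\eta \to 0$). The function names are hypothetical.

```python
import math


def truncated_poisson_mean(eta):
    """Mean of Poisson(eta) conditioned on being at least 2."""
    return eta * math.expm1(eta) / (math.expm1(eta) - eta)


def solve_eta_star(alpha, k):
    """Solve eta*(e^eta - 1)/(e^eta - 1 - eta) = alpha*k by bisection."""
    target = alpha * k
    if target <= 2.0:
        raise ValueError("need alpha * k > 2")
    low, high = 1e-6, 1.0
    while truncated_poisson_mean(high) < target:
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if truncated_poisson_mean(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def theta2(alpha, k):
    """Asymptotic branching factor of the subgraph induced by degree-2 variables."""
    eta = solve_eta_star(alpha, k)
    return eta * (k - 1) / math.expm1(eta)


eta_star = solve_eta_star(0.9, 3)
branching_dense = theta2(0.9, 3)
branching_sparse = theta2(0.7, 3)
```

For $k = 3$ the branching factor at density 0.9 is below 1, while close to $2/k$ it exceeds 1, in line with the monotonicity noted above.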

First we state a technical lemma that we find useful.

Lemma 7.3. Consider any $k$, any $\alpha \in (2/k, 1]$ and $\varepsilon \in (0, 1]$. Then there exist $N_0 = N_0(k, \varepsilon) < \infty$ and $C = C(k) < \infty$ such that the following occurs for all $n > N_0$. Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, m; 0, n)$, $m = n\alpha$. With probability at least $1 - 1/n$, there is no subset of variable nodes $V' \subseteq V$ such that $|V'| \le \varepsilon n$ and the sum of the degrees of nodes in $V'$ exceeds $C \varepsilon \log(1/\varepsilon)\, n$.

Proof. Let $\deg(i)$ be the degree of variable node $i \in V$. Let $X_i \sim$ Poisson$_{\ge 2}(\eta_*)$ be i.i.d. for $i \in V$. Then $(\deg(i))_{i=1}^n$ is distributed as $(X_i)_{i=1}^n$, conditioned on $\sum_{i=1}^n X_i = mk$. Consider $V' = \{1, 2, \dots, l\}$. We have

$$\mathbb{P}\Big\{ \sum_{i=1}^{l} \deg(i) > \gamma l \Big\} = \mathbb{P}\Big\{ \sum_{i=1}^{l} X_i > \gamma l \,\Big|\, \sum_{i=1}^{n} X_i = mk \Big\} \le \frac{\mathbb{P}\{ \sum_{i=1}^{l} X_i > \gamma l \}}{\mathbb{P}\{ \sum_{i=1}^{n} X_i = mk \}} .$$

Now, $n\mathbb{E}[X_i] = n\alpha k = mk$, by our choice of $\eta_*$ in Eq. (53). Since $\alpha \le 1$, we deduce that $\eta_* \le C_1 = C_1(k) < \infty$. Using a local central limit theorem (CLT) for lattice random variables (Theorem 5.4 of [Hal82]) we obtain $\mathbb{P}\{ \sum_{i=1}^n X_i = mk \} \ge C_2 n^{-1/2}$ for some $C_2 = C_2(k) > 0$. A standard Chernoff bound yields $\mathbb{P}\{ \sum_{i=1}^l X_i > \gamma l \} \le \exp\{ -l\gamma C_3 \}$, for some $C_3(k) \in (0, 1]$, provided $\gamma > 2\alpha k$. Thus we obtain

$$\mathbb{P}\Big\{ \sum_{i=1}^{l} \deg(i) > \gamma l \Big\} \le n^{1/2} \exp\{ -l\gamma C_3 \}/C_2 , \tag{54}$$

provided $\gamma > 2\alpha k$. We use $\gamma = C'(1 + \log(1/\varepsilon))$ with $C' = 2\alpha k/C_3$. Take $l = \varepsilon n$. The number of different subsets of variable nodes of size $l$ is $\binom{n}{l} \le (e/\varepsilon)^l$ for $n \ge N_1$ for some $N_1 = N_1(\varepsilon) < \infty$. A union bound gives the desired result. □

Lemma 7.4. Fix $k \ge 3$, and $\delta > 0$ so that for any $\alpha \in [2/k + \delta, 1]$, we have $\theta_2(\alpha, k) \le 1 - \delta$. Then, for any $\delta' > 0$, there exist $\varepsilon = \varepsilon(\delta, k) > 0$, $C = C(\delta, \delta', k) < \infty$ and $N_0 = N_0(\delta, \delta', k) < \infty$ such that the following occurs for every $n > N_0$. Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, \alpha n; 0, n)$. With probability at least $1 - \delta'$, both the following hold:
(i) Consider minimal even subgraphs consisting of only degree 2 variable nodes. There are no more than $C$ such subgraphs. Each of them is a simple cycle consisting of no more than $C$ variable nodes.
(ii) Every even subgraph of $G$ with less than $\varepsilon n$ variable nodes contains only degree 2 variable nodes.

Proof. Part (i): Reveal the $mk$ edges of $G$ sequentially. The expected number of nodes in $V^{(2)}$, conditioned on the first $t$ edges revealed, forms a martingale with differences bounded by 2. Then, from the Azuma-Hoeffding inequality [DP09], we deduce that $|V^{(2)}|$ concentrates around its expectation:

$$\mathbb{P}\Big\{ \big| |V^{(2)}| - \mathbb{E}[|V^{(2)}|] \big| \ge \zeta \sqrt{n} \Big\} \le \exp(-C_1 \zeta^2)$$

for all $\zeta > 0$, where $C_1 = C_1(k) > 0$. The expectation can be computed for instance using the Poisson representation as in the proof of Lemma 7.3, yielding $\big| \mathbb{E}|V^{(2)}| - n\eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \big| \le n^{1/2}$, for all $\alpha \le 1$, $n \ge N_0(k)$. We deduce that for any $\delta_1 = \delta_1(\delta, k) > 0$, we have

$$\mathbb{P}\Big( \big| |V^{(2)}|/n - \eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \big| \ge \delta_1 \Big) \le 1/n \tag{55}$$

for all $n \ge N_1$, where $N_1 = N_1(\delta, k) < \infty$.

Now, condition on $|V^{(2)}| = n^{(2)}$, for some $n^{(2)}$ such that

$$\big| n^{(2)}/n - \eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \big| \le \delta_1 . \tag{56}$$

Note that by choosing $\delta_1$ small enough, we can ensure $n^{(2)} = \Theta(n)$. We are now interested in the check degree distribution $R^{(2)}$ in $G_2$. Reveal the $2n^{(2)}$ edges of $G_2$ sequentially. Consider $l \in \{0, 1, \dots, k\}$. The expected number of check nodes with degree $l$ in $G_2$, conditioned on the edges revealed thus far, forms a martingale with differences bounded by 2. Let $Z \sim \mathrm{Binom}(k, 2n^{(2)}/(mk))$. We have $\mathbb{E}[R^{(2)}_l] = \mathbb{P}(Z = l) + O(1/n)$. Arguing as above for each $l \le k$, we finally obtain

$$\mathbb{P}\Big\{ \sum_{l=0}^{k} \big| R^{(2)}_l - \mathbb{P}(Z = l) \big| \ge \delta_1 \Big\} \le 1/n \tag{57}$$

for all $n \ge N_2$, where $N_2 = N_2(\delta, k) < \infty$.

Now condition on both $n^{(2)}$ satisfying Eq. (56) and $R^{(2)}$ satisfying

$$\sum_{l=0}^{k} \big| R^{(2)}_l - \mathbb{P}(Z = l) \big| \le \delta_1 .$$

Let $\zeta$ be the branching factor of $G_2$ (i.e. of a graph that is uniformly random conditional on the degree profile $R^{(2)}$). Under the above conditions on $n^{(2)}$ and $R^{(2)}$, a straightforward calculation implies that $\zeta$ is bounded above by $\theta_2 + \delta_2$, for some $\delta_2 = \delta_2(\delta_1, k)$ such that $\delta_2 \to 0$ as $\delta_1 \to 0$. Thus, by selecting appropriately small $\delta_1$, we can ensure that $\delta_2 \le \delta/2$, leading to a bound of $1 - \delta/2$ on the branching factor for all $n^{(2)}$, $R^{(2)}$ within the range specified above.

Now we further condition on the degree sequence, i.e., the sequence of check node degrees in $G_2$. The factor graph $G_2$ can be naturally associated to a graph, by replacing each variable node by an edge and each check node by a vertex. This graph is distributed according to the standard (non-bipartite) configuration model. Using [Wor81, Theorem 4], we obtain that the numbers of cycles of length $l \in \{1, 2, \dots, l_0\}$ for a constant $l_0$ are asymptotically independent Poisson random variables, with parameters*

$$\lambda_l = \zeta^l/(2l) \quad \text{for} \quad \zeta = \Big[ \sum_{d=1}^{k} d(d-1) R^{(2)}(d) \Big] \Big/ \Big[ \sum_{d=1}^{k} d\, R^{(2)}(d) \Big] .$$

More precisely, for any constants $c_1, c_2, \dots, c_{l_0} \in \mathbb{N} \cup \{0\}$, we have

$$\mathbb{P}[\mathsf{E}(c)] = \prod_{l=1}^{l_0} \mathbb{P}(\mathrm{Poisson}(\lambda_l) = c_l) + o(1) ,$$

where $\mathsf{E}(c)$ is the event that there are $c_l$ cycles of length $l$ for $l \in \{1, 2, \dots, l_0\}$ with all cycles disjoint from each other, and $c = (c_l)_{l=1}^{l_0}$. Choosing $l_0$ large enough, we have

$$\sum_{c \in \mathcal{N}} \mathbb{P}[\mathsf{E}(c)] \ge 1 - \exp\Big( -\sum_{l=1}^{\infty} \lambda_l \Big) - \delta'/4 = 1 - (1 - \zeta)^{1/2} - \delta'/4 ,$$

where $\mathcal{N} = \{ c : c \ne 0,\ c_l \le l_0 \text{ for } l \in \{1, 2, \dots, l_0\} \}$, for $n$ large enough.

On the other hand, we know that the probability of having no cycles in $G_2$ is $(1 - \zeta)^{1/2} + o(1)$ under our assumption of $\zeta \le 1 - \delta/2$. The argument for this was already outlined in the proof of Lemma 3.11, cf. Section 5.1: the Poisson approximation of [Wor81] is used to estimate the probability of having no cycles of length smaller than $M$, while a simple first moment bound is sufficient for cycles of length $M$ or larger. Thus, with probability at least $1 - \delta'/3$, we have no more than $l_0$ cycles, disjoint and each of length no more than $l_0$. Choosing $C = l_0$, we obtain part (i) with probability at least $1 - \delta'/2$ for large enough $n$.
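The step from the Poisson parameters $\lambda_l = \zeta^l/(2l)$ to the no-cycle probability $(1-\zeta)^{1/2}$ uses the series $\sum_{l \ge 1} \zeta^l/l = -\log(1-\zeta)$. A short numerical check is sketched below; the truncation length and function names are illustrative choices.

```python
import math


def expected_cycle_counts(zeta, max_length):
    """Poisson parameters lambda_l = zeta**l / (2*l) for l = 1..max_length."""
    return [zeta ** l / (2 * l) for l in range(1, max_length + 1)]


def no_cycle_probability(zeta, max_length=2000):
    """exp(-sum_l lambda_l), truncated at max_length."""
    return math.exp(-sum(expected_cycle_counts(zeta, max_length)))


zeta = 0.5
approximate = no_cycle_probability(zeta)
closed_form = math.sqrt(1.0 - zeta)
```

The truncated sum agrees with the closed form to machine precision whenever $\zeta$ is bounded away from 1, which is the regime $\zeta \le 1 - \delta/2$ used in the proof.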

Part (ii): Let $m = \alpha n$. Let $\mathcal{N}(G; l, j)$ be the number of even subgraphs of $G$ induced by $l$ variable nodes such that the sum of the degrees of the $l$ variable nodes is $2(l + j)$. We are interested in $l \le \varepsilon n$ (we will choose $\varepsilon$ later) and $j > 0$. In particular, we want to show that, for any $\delta' > 0$,

$$\mathbb{P}\Big\{ \sum_{l=1}^{\varepsilon n} \sum_{j=1}^{mk/2} \mathcal{N}(G; l, j) > 0 \Big\} \le \delta'/2 . \tag{58}$$

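This immediately implies the desired result. From Lemma 7.3 we deduce that

$$\mathbb{P}\Big\{ \sum_{l=1}^{\varepsilon n} \sum_{j=\varepsilon' n}^{mk/2} \mathcal{N}(G; l, j) > 0 \Big\} \le 1/n , \tag{59}$$

for some $\varepsilon'(\varepsilon, k)$ with the property that $\varepsilon' \to 0$ as $\varepsilon \to 0$. Thus, we only need to establish

$$\sum_{l=1}^{\varepsilon n} \sum_{j=1}^{\varepsilon' n} \mathbb{E}[\mathcal{N}(G; l, j)] \le \delta'/3 \tag{60}$$

for all $n$ large enough, since the claim then follows from linearity of expectation and Markov inequality.

*The model in [Wor81] is slightly different from the configuration model in its treatment of self loops and double edges. However, the results and proof can be adapted to the configuration model.

A straightforward calculation [RU08, MM09] yields

$$\mathbb{E}[\mathcal{N}(G; l, j)] = \frac{\binom{n}{l}\, T_1 T_2 T_3}{\binom{mk}{2(l+j)}\, T_4} ,$$

where

$$T_1 = \mathrm{coeff}\big[ (e^y - 1 - y)^l ;\, y^{2(l+j)} \big] ,$$
$$T_2 = \mathrm{coeff}\big[ (e^y - 1 - y)^{n-l} ;\, y^{mk - 2(l+j)} \big] ,$$
$$T_3 = \mathrm{coeff}\Big[ \Big( \frac{(1+y)^k + (1-y)^k}{2} \Big)^m ;\, y^{2(l+j)} \Big] ,$$
$$T_4 = \mathrm{coeff}\big[ (e^y - 1 - y)^n ;\, y^{mk} \big] .$$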



It is useful to recall the following probabilistic representation of combinatorial coefficients.

Fact 7.5. For any $\eta > 0$, we have

$$\mathrm{coeff}\big[ (e^y - 1 - y)^N ;\, y^M \big] = \eta^{-M} (e^{\eta} - 1 - \eta)^N\, \mathbb{P}\Big[ \sum_{i=1}^{N} X_i = M \Big] , \tag{61}$$

where $X_i \sim$ Poisson$_{\ge 2}(\eta)$ are i.i.d. for $i \in \{1, \dots, N\}$.
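Fact 7.5 can be verified numerically for small $N$ and $M$ by expanding both sides. The sketch below is an illustration only: it computes the power series coefficient by repeated truncated convolution and the probability by convolving the Poisson$_{\ge 2}(\eta)$ mass function; the helper names are hypothetical.

```python
import math


def power_coefficients(base, exponent, max_degree):
    """Coefficients up to max_degree of (sum_b base[b] * y**b) ** exponent."""
    result = [1.0] + [0.0] * max_degree
    for _ in range(exponent):
        updated = [0.0] * (max_degree + 1)
        for a, coeff_a in enumerate(result):
            if coeff_a == 0.0:
                continue
            for b in range(max_degree + 1 - a):
                updated[a + b] += coeff_a * base[b]
        result = updated
    return result


def series_coefficient(N, M):
    """Coefficient of y**M in (e^y - 1 - y)**N."""
    base = [0.0, 0.0] + [1.0 / math.factorial(j) for j in range(2, M + 1)]
    return power_coefficients(base, N, M)[M]


def truncated_poisson_sum_probability(N, M, eta):
    """P[X_1 + ... + X_N = M] for i.i.d. X_i ~ Poisson(eta) conditioned on X_i >= 2."""
    normaliser = math.expm1(eta) - eta
    pmf = [0.0, 0.0] + [eta ** j / math.factorial(j) / normaliser
                        for j in range(2, M + 1)]
    return power_coefficients(pmf, N, M)[M]


N, M, eta = 3, 8, 1.3
lhs = series_coefficient(N, M)
rhs = (eta ** (-M) * (math.expm1(eta) - eta) ** N
       * truncated_poisson_sum_probability(N, M, eta))
```

The identity holds for every $\eta > 0$; the freedom to choose $\eta$ is what the proof exploits when it picks $\eta = \eta_*$ for $T_2$ and $T_4$ and a different small $\eta$ for $T_1$.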

Consider $T_4$. By definition, cf. Eq. (53), $\eta_*$ is such that for $X_i \sim$ Poisson$_{\ge 2}(\eta_*)$ we have $\mathbb{E}[X_i] = \alpha k = mk/n$. Moreover, $\alpha \in [2/k + \delta, 1]$ implies $\eta_* \in [C_1, C_2]$ for some $C_1 = C_1(\delta, k) > 0$ and $C_2 = C_2(k) < \infty$. From $\eta_* \le C_2$ and using a local CLT for lattice random variables [Hal82], it follows that $\mathbb{P}[\sum_{i=1}^{n} X_i = mk] \ge C_3/\sqrt{n}$ for some $C_3 = C_3(\delta, k) > 0$. Thus, using Fact 7.5, we have

$$T_4 \ge \eta_*^{-mk} (e^{\eta_*} - 1 - \eta_*)^n\, C_3 n^{-1/2} . \tag{62}$$

Now, consider $T_2$. Again use $\eta = \eta_*$ in Fact 7.5. From $\eta_* \ge C_1$ and again using a local CLT for lattice r.v.'s [Hal82], we obtain $\mathbb{P}[\sum_{i=1}^{n-l} X_i = mk - 2(l+j)] \le C_4/(2\sqrt{n - l}) \le C_4/\sqrt{n}$ for some $C_4 = C_4(\delta, k) < \infty$, since $l \le \varepsilon n$. Thus, Fact 7.5 yields

$$T_2 \le \eta_*^{-mk + 2(l+j)} (e^{\eta_*} - 1 - \eta_*)^{n-l}\, C_4 n^{-1/2} . \tag{63}$$

Fact 7.5 yields that $T_1$ can be bounded above as

$$T_1 \le \eta^{-2(l+j)} (e^{\eta} - 1 - \eta)^l \tag{64}$$

for any $\eta > 0$. We will choose a suitable $\eta$ later.

Finally, for $T_3$, similar to Fact 7.5, we can deduce that

$$T_3 \le \Big( \frac{(1+\xi)^k + (1-\xi)^k}{2} \Big)^m \xi^{-2(l+j)}$$

for all $\xi > 0$. Now, it is easy to check that

$$\frac{(1+\xi)^k + (1-\xi)^k}{2} \le \exp\Big\{ \binom{k}{2} \xi^2 \Big\}$$

by comparing coefficients in the series expansions of both sides. Choosing $\xi = \sqrt{(l+j)/(m\binom{k}{2})}$, we obtain

$$T_3 \le \Big( \frac{e\, m \binom{k}{2}}{l + j} \Big)^{l+j} . \tag{65}$$

Finally, we have

$$\binom{n}{l} \le \frac{n^l}{l!} , \qquad \binom{mk}{2(l+j)} \ge \frac{(mk - 2(l+j))^{2(l+j)}}{(2(l+j))!} . \tag{66}$$


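Putting together Eqs. (62), (63), (64), (65) and (66), we obtain

$$\mathbb{E}[\mathcal{N}(G; l, j)] \le C_6\, \frac{n^l\, (2(l+j))!}{l!}\, \frac{\eta_*^{2(l+j)}}{\eta^{2(l+j)}}\, \frac{(e^{\eta} - 1 - \eta)^l}{(e^{\eta_*} - 1 - \eta_*)^l} \Big( \frac{e(k-1)(1 + C_5((l+j)/n))}{2(l+j)\, mk} \Big)^{l+j}$$

for some $C_6 = C_6(k, \delta) < \infty$. Now, $N! \ge C_7 \sqrt{N} (N/e)^N$ for all $N \in \mathbb{N}$, for some $C_7 > 0$. Using this with $N = l + j$, we obtain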

e V""' (2a+i))! < Vr+7 (2(/+j))! ^ yr+j _ /2(/ + m ^ C^2^^^+^) 



i + jj n - c, i\{i + 3)\ - c,v \{i + j)J - V 

for some Cs < oo. Plugging back, we get 

where 

Ts = 2g^ (""-l^-^) (i + ^5(1/ + j)/n)) 
^ 4(A; - 1)7,2 



Without loss of generality, assume $\delta \le 0.1$. Now, we choose $\varepsilon = \varepsilon(\delta, k) > 0$ such that $\varepsilon + \varepsilon' \le \delta/(10 C_5)$. We choose $\eta = \eta(k) > 0$ such that $(e^{\eta} - 1 - \eta)\eta^{-2} \le (1 + \delta/10)/2$ (note that $(e^{\eta} - 1 - \eta)\eta^{-2} \to 1/2$ as $\eta \to 0$). This leads to $T_5 \le 1 - \delta/2$ for all $l \le \varepsilon n$ and $j \le \varepsilon' n$, when we use $\theta_2 \le 1 - \delta$. Also, $T_6 \le C_{10}/n$ for all $l, j$, for some $C_{10} = C_{10}(k) < \infty$. Thus,

$$\mathbb{E}[\mathcal{N}(G; l, j)] \le C_9 (1 - \delta/2)^l \Big( \frac{C_{10}}{n} \Big)^j .$$

Summing over $j$ and $l$, we obtain

$$\sum_{l=1}^{\varepsilon n} \sum_{j=1}^{\varepsilon' n} \mathbb{E}[\mathcal{N}(G; l, j)] \le \frac{C_{11}}{n} \tag{67}$$

for some $C_{11} = C_{11}(k, \delta) < \infty$. This implies Eq. (60) for large enough $n$ as required. □

8 Proof of LeminalSiSi A sparse basis for low-weight core solutions

For each $x_C \in \mathcal{L}_C(\varepsilon n)$, we need to find a sparse solution $x \in \mathcal{S}_1$ that matches $x_C$ on the core. From Lemma 3.5 we know that w.h.p., $x_C$ consists of all zeros except for a small subset of variables. Indeed we know from Lemma 7.4 that these variables correspond to a cycle of degree-2 variable nodes. Although this is not used in the following, we shall nevertheless refer to the set of variable nodes corresponding to an element of $\mathcal{L}_C(\varepsilon n)$ as a cycle. Denote by $L_1$ the cycle corresponding to $x_C$. Recall that the non-core $G_{NC} = (F_{NC}, V_{NC}, E_{NC})$ is the subgraph of $G$ induced by $F_{NC} = F \setminus F_C$ and $V_{NC} = V \setminus V_C$. Suppose we set all non-core variables to 0. The set of violated checks consists of those checks in $F_{NC}$ that have an odd number of neighbors in $L_1$. We show that w.h.p., each such check can be satisfied by changing a small number of non-core variables in its neighborhood to 1. To show that this is possible, we make use of the belief propagation algorithm described in Section 4.

Our strategy is roughly the following. Consider a violated check a. We wish to set an odd 
number of its non-core neighboring variables to 1. But then this may cause further checks to be 
violated, and so on. A key fact comes to our rescue. If check node a receives an incoming * message 
in round T, then we can find a subset of non-core variable nodes in a T-neighborhood of a such 
that if we set those variables to 1, check a will be satisfied (with an odd number of neighboring 
ones in the non-core) without causing any new violations. We do this for each violated check. Now 
w.h.p. , for suitable T, all violated checks will receive at least one incoming * by time T (note that 
each non-core check receives an incoming * at the BP fixed point). Thus, we can satisfy them all 
by setting a small number of non-core variables to 1. 
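The repair strategy above can be illustrated on a toy instance. The sketch below is a hypothetical example, not the construction from the proof: the per-check fix sets are chosen by hand (in the proof they come from the $\mathrm{BP}_0$ messages), and the names `marked_checks`, `combine_fix_sets` and `violated_checks` are assumptions. It shows that checks with an odd number of neighbors on the cycle are the violated ones, and that fix sets combine by symmetric difference.

```python
def marked_checks(checks, cycle_vars):
    """Indices of checks with an odd number of neighbors on the cycle."""
    cycle = set(cycle_vars)
    return [a for a, check in enumerate(checks) if len(cycle & set(check)) % 2 == 1]


def combine_fix_sets(fix_sets):
    """Variables that occur an odd number of times across the fix sets."""
    combined = set()
    for fix in fix_sets:
        combined ^= set(fix)
    return combined


def violated_checks(checks, ones):
    """Indices of checks with an odd number of variables set to 1."""
    ones = set(ones)
    return [a for a, check in enumerate(checks) if len(ones & set(check)) % 2 == 1]


cycle = {0, 1}                                  # core variables set to 1
checks = [(0, 5, 6), (1, 6, 7), (0, 1, 8)]      # non-core checks
marked = marked_checks(checks, cycle)
# Hand-picked fix sets: each flips the parity of its own check only.
fix_sets = {0: {6, 7}, 1: {7}}
extra_ones = combine_fix_sets(fix_sets[a] for a in marked)
remaining = violated_checks(checks, cycle | extra_ones)
```

Here `remaining` is empty, so the cycle together with the combined fix set gives a solution of the homogeneous system on these checks.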

Lemma 8.1. Consider $G$ drawn uniformly from $\mathbb{G}(n, k, m)$. Denote by $F^{(l)} \subseteq F_{NC}$ the checks in the non-core having degree $l$ with respect to the non-core, for $l \in \{1, 2, \dots, k\}$. Condition on the core $G_C$, and $F^{(l)}$ for $l \in \{1, 2, \dots, k\}$.

• Then $E_{C,NC}$ and $G_{NC}$ are independent of each other. Here $E_{C,NC}$ denotes the edges between core variables $V_C$ and non-core checks $F_{NC}$.

• The edges in $E_{C,NC}$ are distributed as follows: For each $a \in F_{NC}$, if $a \in F^{(l)}$, its neighborhood in $G_C$ is a uniformly random subset of $V_C$ of size $k - l$, independent of the others.

• Clearly, $(G_C, (F^{(l)})_{l=1}^{k})$ uniquely determine the parameters $(n_{NC}, R^{NC}, m_{NC})$ of the non-core. The non-core $G_{NC}$ is drawn uniformly at random from $\mathbb{D}(n_{NC}, R^{NC}, m_{NC})$ conditioned on being peelable, i.e., $G_{NC}$ is drawn uniformly at random from $\mathbb{D}(n_{NC}, R^{NC}, m_{NC}) \cap \mathcal{P}$.

Proof. Each $G \in \mathbb{G}(n, k, m)$ with the given $(G_C, (F^{(l)})_{l=1}^{k})$ has a $G_{NC}$ corresponding to a unique element of $\mathbb{D}(n_{NC}, R^{NC}, m_{NC}) \cap \mathcal{P}$ and $E_{C,NC}$ corresponding to a subset of $V_C$ of size $k - l$ for each $a \in F^{(l)}$, for $l \in \{1, \dots, k\}$. The converse is also true. This yields the result. □

Proof of Lemma W^ Take any sequence $(s_n)_{n \ge 1}$ such that $\lim_{n \to \infty} s_n = \infty$ and $s_n \le \varepsilon n$. If points (i), (ii) and (iii) in Lemma 3.5 hold, let $V_{\rm cycle}$ denote the union of the supports of the solutions in $\mathcal{L}_C(s_n)$. Let

$$\mathsf{E}_1 = \mathsf{E}_{1,a} \cap \mathsf{E}_{1,b} \cap \mathsf{E}_{1,c} ,$$
$$\mathsf{E}_{1,a} = \{ \text{Points (i), (ii) and (iii) in Lemma 3.5 hold} \} ,$$
$$\mathsf{E}_{1,b} = \{ |F^{(l)}| \ge n/C_2 \text{ for all } l \in \{1, 2, \dots, k\} \} ,$$
$$\mathsf{E}_{1,c} = \{ \text{No variable in } V_{\rm cycle} \text{ has degree exceeding } \log s_n \} .$$

We argue that $\mathsf{E}_1$ holds w.h.p. for an appropriate choice of $C_2 = C_2(k, \alpha) < \infty$. Indeed, Lemma 3.5 implies that $\mathsf{E}_{1,a}$ holds w.h.p. Lemma 4.9 implies that $\mathsf{E}_{1,b}$ holds w.h.p. for sufficiently large $C_2$. Finally, Lemma 8.1 and a subexponential tail bound on the Poisson distribution ensure $\mathsf{E}_{1,c}$ holds w.h.p.

Assume that $\mathsf{E}_1$ holds. Let the sets of variable nodes on the disjoint cycles corresponding to elements of $\mathcal{L}_C(\varepsilon n)$ be denoted by $L_i$ for $i \in \{1, 2, \dots, |\mathcal{L}_C(\varepsilon n)|\}$. Consider a cycle $L_i$. Denote by $a_{ij}$, $j \in \{1, 2, \dots, Z_i\}$, the checks in the non-core having an odd number of neighbors in $L_i$. (Thus, $Z_i$ is the number of such checks.) Call these marked checks. Given $\mathsf{E}_1$, we know that $Z_i \le s_n \log s_n$, and that there are no more than $s_n \log s_n$ marked checks in total:

$$\sum_{i=1}^{|\mathcal{L}_C(s_n)|} Z_i \le s_n \log s_n .$$
Define 



E2 = { No more than n/s„ messages change after Tn iterations of BPq } . 

00 and Sn grows sufficiently 



By Lemma I4.10[ the event E2 holds w.h.p. provided lim„ 
slowly with n (for the given choice of (T„)n>i )• 
Let 



>oo -'n 



Bij = { Not all messages incoming to check Uij have converged to their fixed point value in T^ iterations } 
We wish to show that 



holds w.h.p. . We have 



^i,j ^ij 



<E 



{Gc,Ec 



E 



(^i,jBfj 



I[Ei,E2]> I[i?. 



hj 



Gc, Ecnc 



+ 



[E5] + . 



[E^2] 



Given E2, we know that the number of checks for which an incoming message changes after r„ 
is no more than n/ s\. Suppose aij G F^^' is a marked check. Then we have 



E 



I[Ei,E2]I[S, 



«jj 



Gc,Ec, 



NC 



< 



n 



< 



4|F(0| -C2SI' 



since all check nodes in F^' are equivalent with respect to the non-core, from Lemma 18.11 We 
already know that under Ei, the number of marked checks is bounded by s^logs„. This leads to 



Ujj iJij 



C2Sn 



[E5] + : 



[E1]"^0, 



implying Eq. (j68|) holds w.h.p. . 

Condition on $G_C$ and $E_{C,NC}$. This identifies the marked checks. Lemma 8.1 guarantees that all checks in $F^{(l)}$ are equivalent with respect to $G_{NC}$. Suppose $E_1$ holds. Define a ball of radius $t$ around a check node as consisting of the neighboring variable nodes, and the balls of radius $t$ around each of those variables. Similar to the proof of Lemma 3.11 (iii), we can show that

$$|B_{G_{NC}}(a_{ij}, T_n)| \le C_3^{T_n} \tag{69}$$

holds with probability at least $1 - C_4 \exp(-2^{T_n}/C_4)$, for some $C_3 = C_3(\alpha,k) < \infty$ and $C_4 = C_4(\alpha,k) < \infty$, for all marked checks $a_{ij}$. Thus, by the union bound, the probability that this bound on ball size holds simultaneously for all marked checks is at least $1 - s_n^2 \log s_n\, C_4 \exp(-2^{T_n}/C_4) \to 1$ as $n \to \infty$, provided $T_n \to \infty$ and $s_n$ grows sufficiently slowly with $n$.

Suppose Eq. (68) and $E_1$ hold. Consider any marked check $a_{ij}$ adjacent to $v \in L_i$ for any $L_i$. It receives at least one incoming $*$ message at the $\mathrm{BP}_0$ fixed point and, since $B_{ij}$ does not occur, this is also true after $T_n$ iterations of $\mathrm{BP}_0$. Hence, there is a subset of variables $V^{(i,j)} \subseteq B_{G_{NC}}(a_{ij}, T_n)$, such that setting the variables in $V^{(i,j)}$ to 1 satisfies $a_{ij}$ without violating any other checks. Define

$$V^{(i)} = \{v : v \text{ occurs an odd number of times in the sets } (V^{(i,j)})_{j=1}^{Z_i}\}.$$

It is not hard to verify that the vector $x^{(i)}$ with variables in $L_i \cup V^{(i)}$ set to one and all other variables set to zero is a member of $\mathcal{S}_1$. If Eq. (69) holds for all marked checks, then we deduce that $|V^{(i)}| \le C_3^{T_n} s_n \log s_n \le \varepsilon n$ for $T_n$ and $s_n$ growing sufficiently slowly with $n$. Thus, $x^{(i)} \in \mathcal{S}_1$ is $\varepsilon n$-sparse assuming these events, each of which occurs w.h.p. We repeat this construction for every $L_i$. □
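The set of variables that occur an odd number of times in a family of sets is the iterated symmetric difference of that family, which corresponds to adding the indicator vectors modulo 2. The following is a minimal Python sketch of that operation, assuming each set is given as a Python set of variable indices; the function name `odd_occurrences` is illustrative and not taken from the source.

```python
from functools import reduce


def odd_occurrences(sets):
    """Return the elements that appear in an odd number of the given sets.

    This is the symmetric difference of all the sets, i.e. the support of
    the mod-2 sum of their indicator vectors.
    """
    return reduce(lambda acc, s: acc ^ set(s), sets, set())


# Hypothetical example: variables 1, 2 and 3 each occur twice, variable 4 once.
example = odd_occurrences([{1, 2}, {2, 3}, {3, 4, 1}])
```

Variables that appear an even number of times cancel, which is why flipping the variables of the resulting set changes the parity of each check exactly as flipping all the individual sets in turn would.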
A Proof of Lemma 3.4

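Lemma A.1. Assume that $G$ has no 2-core, and let

$$\mathbb{K} = \begin{bmatrix} \mathbb{H}_{F,U}^{-1}\,\mathbb{H}_{F,W} \\ I_{(n-m)\times(n-m)} \end{bmatrix},$$

where $U$ and $W$ are constructed as in Lemma \3.S\, we order the variables as $U$ followed by $W$, and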
the matrix inverse is taken over GF[2]. Then the columns of M. form a basis of the kernel of S, 
which is also the kernel ofM. In addition, ifM^i^j = 1, then dciijj) < Tq. 

Proof. A standard linear algebra result shows that $\mathbb{K}$ is a basis for the kernel of $\mathbb{H}$. The bottom identity block of $\mathbb{K}$ corresponds to the $(n-m)$ independent variables $w \in W$, and in this block a 1 only occurs if the row and column correspond to the same variable, i.e. for $i,j \in W$, $\mathbb{K}_{i,j} = 1$ implies $i = j$, and thus $d_G(i,j) = 0$. To prove the distance claim for the upper block of $\mathbb{K}$, we proceed by induction on $T_C$. For a variable $u \in U$ that is peeled along with factor node $a \in F$, we will reference $u$ via the factor node it was peeled with as $u_a$.

• Induction base: For $T_C = 1$, $\mathbb{H}_{F,U} = I_m$ and thus

$$\mathbb{K} = \begin{bmatrix} \mathbb{H}_{F,W} \\ I_{(n-m)\times(n-m)} \end{bmatrix}.$$

Since $T_C = 1$, note that every variable node must be connected to no more than 1 factor node. Thus $(\mathbb{H}_{F,W})_{a,i} = 1$ implies that factor node $a$ was connected to independent variable node $i$. Thus, variables $i$ and $u_a$ are both adjacent to factor $a$, and consequently $d_G(u_a, i) = 1$.
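• Inductive step: Assume that $T_C = T + 1$ and consider the graph $J(G) = (F_J, V_J, E_J)$ (recall that $J$ denotes the peeling operator). By construction $T_C(J(G)) = T$, and thus by the inductive hypothesis the columns of

$$\mathbb{K}_{J(G)} = \begin{bmatrix} \mathbb{H}_{F_J,U_J}^{-1}\,\mathbb{H}_{F_J,W_J} \\ I_{((n-n_1)-(m-m_1))\times((n-n_1)-(m-m_1))} \end{bmatrix}$$

form a basis for the kernel of $\mathbb{H}_{J(G)}$, where $F_J$, $U_J$, and $W_J$ refer to the set of factor nodes of the factor graph $J(G)$, and their corresponding partition, respectively. In addition, $(\mathbb{K}_{J(G)})_{a,i} = 1$ only if $d_{J(G)}(u_a, i) \le T$. To extend this basis to a basis for the kernel of $\mathbb{H}$, note that

$$\mathbb{K} = \begin{bmatrix} \begin{bmatrix} \mathbb{H}_{F_1,U_1} & \mathbb{H}_{F_1,U_J} \\ 0 & \mathbb{H}_{F_J,U_J} \end{bmatrix}^{-1} \begin{bmatrix} \mathbb{H}_{F_1,W_1} & \mathbb{H}_{F_1,W_J} \\ 0 & \mathbb{H}_{F_J,W_J} \end{bmatrix} \\ I_{(n-m)\times(n-m)} \end{bmatrix}$$

$$= \begin{bmatrix} \begin{bmatrix} I_{|U_1|} & -\mathbb{H}_{F_1,U_J}\,\mathbb{H}_{F_J,U_J}^{-1} \\ 0 & \mathbb{H}_{F_J,U_J}^{-1} \end{bmatrix} \begin{bmatrix} \mathbb{H}_{F_1,W_1} & \mathbb{H}_{F_1,W_J} \\ 0 & \mathbb{H}_{F_J,W_J} \end{bmatrix} \\ I_{(n-m)\times(n-m)} \end{bmatrix}$$

$$= \begin{bmatrix} \mathbb{H}_{F_1,W_1} & \mathbb{H}_{F_1,W_J} + \mathbb{H}_{F_1,U_J}\,\mathbb{H}_{F_J,U_J}^{-1}\,\mathbb{H}_{F_J,W_J} \\ 0 & \mathbb{H}_{F_J,U_J}^{-1}\,\mathbb{H}_{F_J,W_J} \\ \multicolumn{2}{c}{I_{(n-m)\times(n-m)}} \end{bmatrix},$$

where signs are immaterial because all arithmetic is over GF[2].

By construction, if $(\mathbb{H}_{F_1,W_1})_{a,i} = 1$, then $d_G(u_a, i) = 1 \le T$. Consider the $(a,i)$ entry of the matrix $B = \mathbb{H}_{F_1,W_J} + \mathbb{H}_{F_1,U_J}\,\mathbb{H}_{F_J,U_J}^{-1}\,\mathbb{H}_{F_J,W_J}$. A necessary condition for $B_{a,i} = 1$ is the existence of an edge between check node $a \in F_1$ and independent variable node $i \in W \setminus W_1 = W_J$ (i.e. $(\mathbb{H}_{F_1,W_J})_{a,i} = 1$), or the existence of both an edge between $a \in F_1$ and a dependent variable node $j \in U_J$ that is in the basis for independent variable $i$ (i.e. $(\mathbb{H}_{F_1,U_J})_{a,j} = 1$, $(\mathbb{K}_{J(G)})_{j,i} = 1$).

We note that if $d_{J(G)}(u_a, i) \le T$, then $d_G(u_a, i) \le T$ also, since $E_J \subseteq E$. Thus, if $(\mathbb{H}_{F_1,U_J})_{a,j} = 1$ and $(\mathbb{K}_{J(G)})_{j,i} = 1$, then $d_G(u_a, i) \le T + 1$. Similarly, if $(\mathbb{H}_{F_1,W_J})_{a,i} = 1$, then $d_G(u_a, i) = 1$ as in the base case. Thus, if $\mathbb{K}_{i,j} = 1$, then $d_G(i,j) \le T + 1 = T_C$.

□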
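The standard linear algebra fact invoked in this proof can be checked on a small instance. The sketch below assumes the parity-check matrix is split as $[\mathbb{H}_{F,U} \mid \mathbb{H}_{F,W}]$ with $\mathbb{H}_{F,U}$ square and invertible over GF[2], and builds the block matrix with $\mathbb{H}_{F,U}^{-1}\mathbb{H}_{F,W}$ stacked on top of an identity. The helper names `gf2_solve` and `kernel_basis` and the example matrices are illustrative assumptions, not taken from the source.

```python
def gf2_solve(A, B):
    """Solve A X = B over GF(2) for a square invertible A (rows of 0/1 ints)."""
    m = len(A)
    # Augmented matrix [A | B], reduced by Gauss-Jordan elimination mod 2.
    M = [A[i][:] + B[i][:] for i in range(m)]
    for col in range(m):
        # Assumes A is invertible, so a pivot always exists.
        pivot = next(r for r in range(col, m) if M[r][col] == 1)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(m):
            if r != col and M[r][col] == 1:
                M[r] = [x ^ y for x, y in zip(M[r], M[col])]
    return [row[m:] for row in M]


def kernel_basis(H_FU, H_FW):
    """Rows of the stacked matrix [H_FU^{-1} H_FW ; I]; its columns span the kernel."""
    top = gf2_solve(H_FU, H_FW)
    k = len(H_FW[0])
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    return top + identity


# Hypothetical example with m = 2 checks and n = 4 variables (U first, then W).
H_FU = [[1, 1], [0, 1]]
H_FW = [[1, 0], [1, 1]]
K = kernel_basis(H_FU, H_FW)
H = [u + w for u, w in zip(H_FU, H_FW)]
# Every entry of H K must vanish mod 2.
products = [sum(H[a][i] * K[i][j] for i in range(4)) % 2
            for a in range(2) for j in range(2)]
```

Each column of `K` has a single 1 in the identity block, so the columns are linearly independent, and there are $n - m$ of them, which matches the dimension of the kernel when $\mathbb{H}_{F,U}$ is invertible.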



A direct result of this is the sparsity bound given below.

Lemma A.2. For $\mathbb{K}$ constructed as in Lemma A.1, the columns of $\mathbb{K}$ form an $s$-sparse basis for the kernel of $\mathbb{H}$, with

$$s \le \max_{i} |B_G(i, T_C)|.$$

Proof. By Lemma A.1, $d_G(a,i) \le T_C$ is a necessary condition for $\mathbb{K}_{a,i} = 1$. Thus, for all $i \in W$, the $i$th column of $\mathbb{K}$ can only contain 1's on the entries that correspond to variables at distance at most $T_C$ from $i$. The result follows by taking the maximum over all $i \in W$. □

Proof of Lemma 3.4. Let

$$\mathbb{K} = \mathbb{L} \begin{bmatrix} \mathbb{Q}_{F_*,U_*}^{-1}\,\mathbb{Q}_{F_*,W_*} \\ I_{(n-m)\times(n-m)} \end{bmatrix},$$

where the matrix inverse is taken over GF[2]. If $G_* \neq G$, then all degree 2 check nodes constrain their adjacent variable nodes to the same value. Therefore, all variables in the same connected component take on the same value in a satisfying solution, i.e. for all $v_* \in V_*$, if $\mathbb{H}x = 0$, then either $x_i = 0$ for all $i \in v_*$ or $x_i = 1$ for all $i \in v_*$. Consequently, $\mathbb{H}x = 0$ if and only if $x = \mathbb{L}x_*$ for some $x_*$ such that $\mathbb{Q}x_* = 0$. Thus $\{x^{(1)}, \dots, x^{(n-m)}\}$ is a basis for the kernel of $\mathbb{H}$ if and only if $x^{(l)} = \mathbb{L}x_*^{(l)}$ and $\{x_*^{(1)}, \dots, x_*^{(n-m)}\}$ is a basis for the kernel of $\mathbb{Q}$.

Finally notice that $\mathbb{L}x_*$ has $|v_*|$ non-zero entries for each $v_* \in V_*$ such that $(x_*)_{v_*} \neq 0$. Thus, the sparsity bound follows as a direct extension of the bound from Lemma A.2, and the columns of $\mathbb{K}$ form an $s$-sparse basis for the kernel of $\mathbb{H}$, with

$$s \le \max_{v_* \in V_*} S(v_*, T_C(G_*)).$$
B Proofs of technical lemmas in Section 5

Proof of Lemma\5M Let $w = \alpha R'(1)$. Define $f(z) = 1 - \Lambda(1 - \rho(z)) = 1 - \exp(-\alpha R'(1)\rho(z))$. We obtain

$$f'(0) = 2\alpha R_2. \tag{70}$$

Since $z_t \to 0$ as $t \to \infty$, it follows that $\lim_{t\to\infty} z_{t+1}/z_t = f'(0)$. We then deduce from peelability at rate $\eta$ that

$$f'(0) \le 1 - \eta. \tag{71}$$

Combining Eqs. (70) and (71), we obtain the desired result (i).

In order to prove (ii) notice that, for the pair to be peelable, we need $z \ge 1 - \exp(-\alpha R'(z))$ for all $z \in [0,1]$, i.e.

$$R'^{-1}(x) \ge 1 - e^{-\alpha x} \quad \text{for all } x \in [0, R'(1)], \tag{72}$$

where $R'^{-1}$ is the inverse mapping of $z \mapsto R'(z)$. We next integrate the above over $[0, R'(1)]$, using

$$\int_0^{R'(1)} R'^{-1}(x)\,dx = \int_0^1 w R''(w)\,dw = R'(1) - 1, \tag{73}$$

$$\int_0^{R'(1)} (1 - e^{-\alpha x})\,dx = R'(1) - \frac{1}{\alpha}\big(1 - e^{-\alpha R'(1)}\big). \tag{74}$$

We thus obtain

$$1 \le \frac{1}{\alpha}\big(1 - e^{-\alpha R'(1)}\big), \tag{75}$$

which yields $\alpha \le 1 - e^{-\alpha R'(1)} < 1$. □
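As a quick sanity check of the integral of the inverse map used in the proof of (ii), consider the single-degree profile $R(z) = z^k$ with $k \ge 3$. This choice of $R$ is an illustrative assumption, not a case singled out by the source.

```latex
R'(z) = k z^{k-1}, \qquad R'(1) = k, \qquad R'^{-1}(x) = (x/k)^{1/(k-1)},
\]
\[
\int_0^{k} (x/k)^{1/(k-1)}\,dx
  = k \int_0^1 u^{1/(k-1)}\,du
  = k \cdot \frac{k-1}{k}
  = k - 1
  = R'(1) - 1 .
```

In this case the integral of the inverse map indeed equals $R'(1) - 1$, and the conclusion of (ii) reads $\alpha \le 1 - e^{-\alpha k} < 1$.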

Proof of Lemma 5.3. We use the notation $m(G) = (m_l(G))_{l=2}^{k}$, whereby $m_l(G)$ is the number of check nodes of degree $l$ in $G$. Let

$$n_1^{(t)} = n_1(J_t), \qquad n_2^{(t)} = n_2(J_t), \qquad m^{(t)} = m(J_t),$$

$$\alpha^{(t)} = \Big(\sum_{l=2}^{k} m_l^{(t)}\Big)\Big/ n, \qquad R_l^{(t)} = m_l^{(t)} \Big/ \Big(\sum_{l'=2}^{k} m_{l'}^{(t)}\Big) \quad \text{for } l \in \{2, 3, \dots, k\}.$$

Note that $R^{(t)}$ defined above is, in fact, the check degree profile of $J_t$.

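As above, let $J(\cdot)$ denote the operator corresponding to one round of synchronous peeling (so that $J_t = J^t(G)$). Define the set

$$S(G; \tilde m, \tilde n_1, \tilde n_2) = \{\tilde G : n_1(\tilde G) = \tilde n_1,\ n_2(\tilde G) = \tilde n_2,\ m(\tilde G) = \tilde m,\ J(\tilde G) = G\}.$$

We prove the result by induction. By definition, we know that $J_0 = G$ is drawn uniformly from $\mathbb{C}(n, R, \alpha n)$. Suppose that, conditioned on $m^{(t)}, n_1^{(t)}, n_2^{(t)}$, the graph $J_t$ is drawn uniformly from $\mathbb{C}(n, R^{(t)}, \alpha^{(t)} n)$. Let the probability of each possible $J_t$ (with parameters $(m^{(t)}, n_1^{(t)}, n_2^{(t)})$) be denoted by $q(m^{(t)}, n_1^{(t)}, n_2^{(t)})$. Consider a candidate graph $G'$ with parameters $(m', n_1', n_2')$. We have

$$\mathbb{P}[J_{t+1} = G'] = \sum_{J_t :\, J(J_t) = G'} \mathbb{P}[J_t]$$

$$= \sum_{\tilde m, \tilde n_1, \tilde n_2}\ \sum_{J_t \in S(G'; \tilde m, \tilde n_1, \tilde n_2)} \mathbb{P}[J_t]$$

$$= \sum_{\tilde m, \tilde n_1, \tilde n_2} q(\tilde m, \tilde n_1, \tilde n_2)\, |S(G'; \tilde m, \tilde n_1, \tilde n_2)|.$$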

A straightforward count yields 

|5(G';m,ni,n2)| = ("" ~ 'J^' ~ ""') " A' " coeff [(e^ - ir'Ke^)"^; z^-"i] • l[h2 = n'^ + n'2] , 

where A = '}2i=i{''^i ~ "^/)^- Thus, P[Jt+i = G'] depends on G' only through {rn',n[, 712). D 

To simplify the proof of Lemma 5.4 we first prove a simple technical lemma.

Lemma B.1. Let $G = (F, V, E)$ be a factor graph that is a tree with no check node of degree 1 or 2, rooted at a variable node $v$, with $|V| > 1$. Then $|\{u \in V : \deg(u) \le 1,\ u \neq v\}| \ge |V|/2$, i.e. at least half of all variable nodes are leaves. (Here a leaf is defined as a variable node that is distinct from the root and has degree at most 1.)

Proof. We proceed by induction on the maximum depth $t$ of the tree $G$ rooted at $v$.

• Induction base: For a tree of depth 1, let $c = \deg(v) > 0$. Since all check nodes have degree 3 or more, $G$ has $N_l \ge 2c$ leaves and $|V| = N_l + 1$. Clearly, $N_l \ge |V|/2$.

• Inductive step: Consider $G$ having depth $t + 1$ and perform 1 round of synchronous peeling, resulting in $J(G) = G' = (F', V', E')$. Let $N_l'$ be the number of leaves in $V'$. The inductive hypothesis implies $|V'| \le 2N_l'$, since $G'$ is also a tree. Since, by construction, every factor node has degree at least 3 in $G$, every leaf in $G'$ must have at least 2 leaves in $G$ as descendants, i.e., $2N_l' \le N_l$, where $N_l$ is the number of leaves in $G$. Combining these two inequalities yields

$$|V| = |V'| + N_l \le 2N_l' + N_l \le 2N_l,$$

as desired.

□
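The leaf-counting claim of Lemma B.1 can be checked numerically on randomly generated trees. The sketch below is only an illustration under stated assumptions: the generator `random_factor_tree`, its branching parameters and the depth cap are hypothetical choices, made so that every check node has degree at least 3 and the root has at least one adjacent check.

```python
import random


def random_factor_tree(rng, max_depth=5):
    """Return the list of variable degrees of a random factor tree.

    Index 0 is the root. Every check has one parent variable and between
    2 and 4 child variables, so its degree is at least 3.
    """
    degrees = [0]          # degrees[v] = number of checks adjacent to variable v
    frontier = [(0, 0)]    # (variable index, depth)
    while frontier:
        v, depth = frontier.pop()
        if depth >= max_depth:
            continue
        # The root gets at least one child check so that |V| > 1.
        n_checks = rng.randint(1 if v == 0 else 0, 2)
        for _ in range(n_checks):
            degrees[v] += 1
            for _ in range(rng.randint(2, 4)):
                degrees.append(1)   # the child is adjacent to this check
                frontier.append((len(degrees) - 1, depth + 1))
    return degrees


def leaf_fraction(degrees):
    """Fraction of variables that are leaves (non-root, degree at most 1)."""
    leaves = sum(1 for v, d in enumerate(degrees) if v != 0 and d <= 1)
    return leaves / len(degrees)


fractions = [leaf_fraction(random_factor_tree(random.Random(seed)))
             for seed in range(50)]
```

Under these assumptions every entry of `fractions` should be at least 1/2, in line with the lemma.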
Proof of Lemma 5.4. By Lemma B.1, if $G$ is a tree, at least one half of all variable nodes are leaves at every stage of peeling. Thus, $G$ is peelable and $T_C(G) \le \lceil \log_2 |V| \rceil$. (After $\lceil \log_2 |V| \rceil - 1$ rounds of peeling, we have 2 or fewer variable nodes remaining, and hence no checks. At most one more round of peeling leads to annihilation.)

Now suppose $G$ is unicyclic. Each factor in the cycle has degree at least 3, hence it has a neighbor outside the cycle and must eventually get peeled. Breaking ties arbitrarily, let $a$ be the first factor in the cycle to be peeled, and let $u \in \partial a$ be the variable node that 'causes' it to get peeled (clearly $u$ is not in the cycle). Let $t_u \le T_C(G)$ be the peeling round in which $u$ and $a$ are peeled. Consider the subtree $G_u = (F_u, V_u, E_u)$ rooted at $u$ defined as follows: $G_u$ is the maximal connected subgraph of $G$ that includes $u$, but not $a$. Using Lemma B.1 on this subtree and reasoning as above, we have $t_u \le \lceil \log_2 |V_u| \rceil \le \lceil \log_2 |V| \rceil$.

As at least one factor node in the unicycle is peeled in round $t_u$, we must have that $J_{t_u}$ is a tree or forest, which by Lemma B.1 can be peeled in at most $\lceil \log_2 |V| \rceil$ additional iterations, since the number of variable nodes in $J_{t_u}$ is at most $|V|$. Thus, $T_C(G) \le t_u + \lceil \log_2 |V| \rceil$. Combining these two inequalities yields

$$T_C(G) \le t_u + \lceil \log_2 |V| \rceil \le 2 \lceil \log_2 |V| \rceil.$$

□
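The logarithmic bound for trees can be illustrated by simulating synchronous peeling. The sketch below assumes that one round removes every variable node of degree at most 1 together with the check adjacent to it (if any) and all edges of that check; the names `random_tree` and `peel_rounds`, and the random tree generator itself, are hypothetical helpers rather than anything defined in the source.

```python
import math
import random


def random_tree(rng, max_depth=4):
    """Random factor tree: returns (number of variables, list of checks).

    Each check is a list [parent, child_1, ..., child_c] with 2 <= c <= 4,
    so every check has degree at least 3. Variable 0 is the root.
    """
    checks, num_vars, frontier = [], 1, [(0, 0)]
    while frontier:
        v, depth = frontier.pop()
        if depth >= max_depth:
            continue
        for _ in range(rng.randint(1 if v == 0 else 0, 2)):
            children = list(range(num_vars, num_vars + rng.randint(2, 4)))
            num_vars += len(children)
            checks.append([v] + children)
            frontier.extend((c, depth + 1) for c in children)
    return num_vars, checks


def peel_rounds(num_vars, checks):
    """Rounds of synchronous peeling until no variable is left.

    Returns None if peeling gets stuck, i.e. a non-empty 2-core remains.
    """
    alive_vars = set(range(num_vars))
    alive_checks = {a: set(vs) for a, vs in enumerate(checks)}
    rounds = 0
    while alive_vars:
        degree = {v: 0 for v in alive_vars}
        for vs in alive_checks.values():
            for v in vs:
                degree[v] += 1
        peeled = {v for v in alive_vars if degree[v] <= 1}
        if not peeled:
            return None
        # Drop every check that touches a peeled variable, with all its edges.
        alive_checks = {a: vs for a, vs in alive_checks.items()
                        if not (vs & peeled)}
        alive_vars -= peeled
        rounds += 1
    return rounds
```

For a tree with `num_vars` variables generated this way, the value returned by `peel_rounds` should not exceed `math.ceil(math.log2(num_vars))`.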

Proof of Lemma 5.5. The lemma can be derived from known results (see, e.g., [AN72]), but we find it easier to provide an independent proof.

We use a generating function approach to prove the bound

$$\mathbb{P}\big[Z_t > (\beta\theta)^t\big] \le 2 \exp\big(-C (\beta/2)^t\big). \tag{76}$$

Equation (34) follows (eventually for a different constant $C$) via union bound.

Define $f(s) = \mathbb{E}[s^{Z_1}] = \sum_{j \ge 0} s^j\, \mathbb{P}[Z_1 = j]$. By assumption, it is clear that $f(s)$ is finite for $s \in (0, 1/(1-\delta))$. Define $f^{(t)}(s) = \mathbb{E}[s^{Z_t}]$ for $t \ge 1$ (so that $f(s) = f^{(1)}(s)$). It is well known that

$$f^{(t)}(s) = f(f^{(t-1)}(s)) \tag{77}$$

for $t \ge 2$. It follows that $f^{(t)}(s)$ is finite for $s \in (0, 1/(1-\delta))$, and all $t \ge 2$.
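The composition rule for generating functions of a branching process can be verified exactly on a small example. The sketch below assumes a process started from a single individual and a hypothetical offspring law supported on {0, 1, 2}; the helper names `pgf`, `convolve` and `next_generation` are illustrative and do not come from the source.

```python
from fractions import Fraction


def pgf(p, s):
    """Evaluate the generating function sum_j p[j] * s**j."""
    return sum(pj * s**j for j, pj in enumerate(p))


def convolve(p, q):
    """Law of the sum of two independent variables with laws p and q."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def next_generation(dist, offspring):
    """Law of Z_{t+1} given the law `dist` of Z_t."""
    result = [Fraction(0)]
    power = [Fraction(1)]   # law of a sum of zero offspring counts
    for prob_n in dist:
        # Add prob_n times the law of a sum of n i.i.d. offspring counts.
        result += [Fraction(0)] * (len(power) - len(result))
        for j, c in enumerate(power):
            result[j] += prob_n * c
        power = convolve(power, offspring)
    return result


# Hypothetical offspring law: P(0) = 1/4, P(1) = 1/4, P(2) = 1/2.
offspring = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
z1 = next_generation([Fraction(0), Fraction(1)], offspring)   # Z_0 = 1
z2 = next_generation(z1, offspring)
s = Fraction(3, 2)
lhs, rhs = pgf(z2, s), pgf(offspring, pgf(offspring, s))
```

Because the arithmetic uses exact fractions, `lhs` and `rhs` agree exactly, which is the case $t = 2$ of the recursion.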

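By dominated convergence $f$ is differentiable at $1$ with $f'(1) = \theta$. Hence there exists $\varepsilon_0 > 0$ such that, for all $\varepsilon \in [0, \varepsilon_0]$,

$$f(1 + \varepsilon) \le 1 + 2\theta\varepsilon. \tag{78}$$

By applying the recursion (77) and the fact that $f$ is monotone increasing, we obtain, for all $\varepsilon \in [0, \varepsilon_0]$,

$$f^{(t)}(1 + \varepsilon) \le 1 + (2\theta)^t \varepsilon. \tag{79}$$

In particular, setting $\varepsilon = \varepsilon_0/(2\theta)^t$, we get $f^{(t)}(1 + \varepsilon) \le 1 + \varepsilon_0 \le 2$.
Finally, by Markov inequality,

$$\mathbb{P}\big[Z_t > (\beta\theta)^t\big] \le f^{(t)}(1+\varepsilon)\,(1+\varepsilon)^{-(\beta\theta)^t} \le 2\Big(1 - \frac{\varepsilon}{2}\Big)^{(\beta\theta)^t} \le 2 e^{-\varepsilon_0 (\beta/2)^t/2},$$

which concludes the proof. □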
C Proof of Technical Lemmas of Section 6

Proof of Lemma \6.1[ We prove this lemma by induction. Let i?| and i?u be the result of t steps 
of backbone augmentation on graphs Gs and G with initial graphs B^ and B^ respectively. By 
assumption B^ C B^ ■ Now assume B^ C B\ . It is enough to show that if a G B^ \B\ 
then a G Bu ■ Since a G B^ \-Sj , we know that a ^ G and has at most one neighbor outside 
of -B[ . By induction assumption B^ C B\ and therefore a has at most one neighbor outside 
B\ . Hence, either a € -Bu or it is added to i?u at step t + 1. D 

Proof of Lemma \6.(A Define $f(x) = 1 - \exp\{-k\alpha x^{k-1}\}$. It follows immediately from the definition of $\alpha_d(k)$ that, for $\alpha > \alpha_d(k)$, $Q > 0$ with $f'(Q) \le 1$. Furthermore, a straightforward calculation yields

$$f'(Q) = k(k-1)\alpha Q^{k-2} \exp\{-k\alpha Q^{k-1}\}. \tag{80}$$

It is therefore sufficient to exclude the case $f'(Q) = 1$. Solving the equations $f(Q) = Q$ and $f'(Q) = 1$, we get the following equation for $Q$:

$$-(1-Q)\log(1-Q) = \frac{Q}{k-1}, \tag{81}$$

which has a unique solution $Q_*(k)$ due to the concavity of the left hand side. We can then solve for $\alpha$, yielding the unique value $\alpha = \alpha_*(k)$ such that $f(Q) = Q$ and $f'(Q) = 1$ admits a solution. On the other hand, these two equations are satisfied at $\alpha_d(k)$ by a continuity argument. It follows that $\alpha_*(k) = \alpha_d(k)$ and hence $f'(Q) < 1$ for all $\alpha > \alpha_d(k)$. □
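The pair of equations $f(Q) = Q$, $f'(Q) = 1$ can be solved numerically. The sketch below assumes $k \ge 3$, finds the non-zero root of $-(1-Q)\log(1-Q) = Q/(k-1)$ by bisection, and then recovers $\alpha$ from $f(Q) = Q$. The function name `critical_point`, the bracketing interval and the iteration count are illustrative choices, not part of the source.

```python
import math


def critical_point(k, iterations=200):
    """Return (Q, alpha) solving f(Q) = Q and f'(Q) = 1 for
    f(x) = 1 - exp(-k * alpha * x**(k - 1)), assuming k >= 3."""
    def g(q):
        # g(q) = -(1 - q) log(1 - q) - q / (k - 1); its non-zero root is Q.
        return -(1.0 - q) * math.log1p(-q) - q / (k - 1)

    lo, hi = 1e-9, 1.0 - 1e-12   # g(lo) > 0 > g(hi) when k >= 3
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    # f(Q) = Q is equivalent to k * alpha * Q**(k-1) = -log(1 - Q).
    alpha = -math.log1p(-q) / (k * q ** (k - 1))
    return q, alpha


q3, alpha3 = critical_point(3)
```

For $k = 3$ this evaluates to roughly $Q \approx 0.715$ and $\alpha \approx 0.818$.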